public inbox for gcc-patches@gcc.gnu.org
* [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
@ 2013-10-21 16:46 Martin Jambor
  2013-10-22  0:03 ` Steven Bosscher
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Jambor @ 2013-10-21 16:46 UTC (permalink / raw)
  To: GCC Patches; +Cc: Vladimir Makarov

Hi,

in spring I suggested scheduling pass_cprop_hardreg before
pass_thread_prologue_and_epilogue in order to create many more
shrink-wrapping opportunities.  The problem is that formal arguments
of a function which need to be saved across calls on slow paths often
get assigned callee-saved registers, and the very first BB thus needs
a prologue, which makes any attempt at shrink-wrapping difficult.

Steven suggested that splitting live-ranges of these registers might
be a better approach (and Vlad suggested that I should look at
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01137.html, a big thanks
to both) and this is what the patch below does.
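
A source-level analogy of the idea, as a minimal sketch: the real
transformation works on RTL pseudos inside IRA, not on C variables, and
an optimizing compiler will treat the two functions below as identical.
The names slow_path_helper, g_state, bar_unsplit and bar_split are made
up for illustration.

```c
/* Hypothetical global that the slow path updates.  */
static int g_state;

/* Hypothetical callee that is only reached on the slow path.  */
int slow_path_helper (int a)
{
  return a + 5;
}

/* Unsplit: 'a' is one value that is live from function entry across the
   call, so it tends to end up in a callee-saved register and the
   prologue is needed already in the first block.  */
int bar_unsplit (int a)
{
  int r;
  if (a)
    {
      r = slow_path_helper (a);
      g_state = r + a;		/* 'a' is still needed after the call.  */
    }
  else
    r = a + 1;
  return r;
}

/* Split: a fresh copy is created where the slow path begins, and only
   that copy is live across the call.  The original 'a' no longer needs
   to survive a call, so the fast path needs no saved register.  */
int bar_split (int a)
{
  int r;
  if (a)
    {
      int a_copy = a;		/* The new "pseudo" created by the split.  */
      r = slow_path_helper (a_copy);
      g_state = r + a_copy;
    }
  else
    r = a + 1;
  return r;
}
```

Both variants compute the same results; the difference is only in which
value has to stay alive across the call.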

The basic idea is to split live ranges of problematic pseudos in the
common dominator of all non-sibling calls and all uses of those
pseudos in BBs which are reachable from calls, if such dominator is
not in a loop and if it does not post-dominate the entry BB.
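
The search for the split point can be sketched outside of GCC on a toy
CFG.  This is only an illustration under simplifying assumptions: blocks
are bits in a 32-bit mask, block 0 is the first BB, and the loop and
post-dominator checks are left out.  None of the names below are GCC
functions; the patch itself uses nearest_common_dominator_for_set.

```c
#include <stdint.h>

#define TOY_NBLOCKS 4

/* Successor masks of a hypothetical diamond-shaped CFG:
   0 -> {1, 2}, 1 -> {3}, 2 -> {3}, 3 -> {}.  Block 1 holds the call.  */
static const uint32_t toy_succ[TOY_NBLOCKS] = { 0x6, 0x8, 0x8, 0x0 };

/* Return the set of blocks reachable from SEEDS, seeds included.  */
uint32_t reachable_from (const uint32_t *succ, int n, uint32_t seeds)
{
  uint32_t seen = seeds;
  int changed = 1;
  while (changed)
    {
      changed = 0;
      for (int b = 0; b < n; b++)
	if (((seen >> b) & 1) && (succ[b] & ~seen))
	  {
	    seen |= succ[b];
	    changed = 1;
	  }
    }
  return seen;
}

/* Fill DOM[b] with the set of blocks dominating b (iterative dataflow).  */
void compute_dominators (const uint32_t *succ, int n, uint32_t *dom)
{
  uint32_t all = (n >= 32) ? UINT32_MAX : ((UINT32_C (1) << n) - 1);
  int changed = 1;
  dom[0] = 1;
  for (int b = 1; b < n; b++)
    dom[b] = all;
  while (changed)
    {
      changed = 0;
      for (int b = 1; b < n; b++)
	{
	  uint32_t d = all;
	  /* Intersect the dominator sets of all predecessors of b.  */
	  for (int p = 0; p < n; p++)
	    if ((succ[p] >> b) & 1)
	      d &= dom[p];
	  d |= UINT32_C (1) << b;
	  if (d != dom[b])
	    {
	      dom[b] = d;
	      changed = 1;
	    }
	}
    }
}

/* Return the deepest block dominating every block in SET, or -1.  */
int nearest_common_dominator (const uint32_t *dom, int n, uint32_t set)
{
  uint32_t common = UINT32_MAX;
  for (int b = 0; b < n; b++)
    if ((set >> b) & 1)
      common &= dom[b];
  /* The nearest common dominator is the block whose own dominator set
     is exactly the set of common dominators.  */
  for (int c = 0; c < n; c++)
    if (((common >> c) & 1) && dom[c] == common)
      return c;
  return -1;
}
```

With the call in block 1 only, the nearest common dominator is block 1
itself, so the split would happen there.  If block 2 also needed the
new register, the answer would be block 0, the first BB, and the
transformation would give up.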

As far as the number of performed shrink-wrappings is concerned, this
has an even bigger effect than scheduling pass_cprop_hardreg earlier,
as I proposed in spring.  Apart from looking at bootstrap and SPEC
2006 (at -Ofast) I have tried compiling Python and Ruby (with whatever
their default flags were, I believe mostly -O3) because David
suggested at his Cauldron talk last year that shrink-wrapping may be
important:

    Number of performed shrink-wrappings:
    | Source                  | Trunk | With patch |       % |
    |-------------------------+-------+------------+---------|
    | SPEC 2006 Int           |   734 |       1670 | +127.52 |
    | SPEC 2006 FP (*)        |   575 |       1022 |  +77.74 |
    | C/C++ bootstrap stage 2 |  1748 |       3015 |  +72.48 |
    | Python 2.7.5            |   231 |        401 |  +73.59 |
    | Python 3.3.2            |   234 |        416 |  +77.78 |
    | Ruby 2.0.0-p247         |   331 |        464 |  +40.18 |

(*) I forgot to increase the stack limit and I believe that is the
reason why 410.bwaves, 416.gamess, 447.dealII and 481.wrf SPEC 2006 FP
benchmarks failed, regardless of my patch, but I have not investigated
what exactly happened, so there might have even been compile-time
issues.
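
For reference, the % column in the table above is the relative change
from trunk to the patched compiler.  A minimal sketch of that arithmetic
(the helper name percent_change is made up):

```c
/* Relative change from BEFORE to AFTER, in percent.  */
double percent_change (double before, double after)
{
  return (after - before) / before * 100.0;
}
```

For the SPEC 2006 Int row, percent_change (734, 1670) is
(1670 - 734) / 734 * 100, which rounds to +127.52.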

As far as run-times are concerned, I have tried running the benchmarks
from hg.python.org/benchmarks but I do not have anything to report.
David claimed that shrink-wrapping in function lookdict_string is
likely to help, but this patch alone does not make that happen; I am
investigating what else needs to be done.  I have not yet tried
benchmarking Ruby.  SPEC 2006 (*) run times are a bit encouraging.

    Run-times (seconds, median of three runs):
    | Benchmark      | Trunk | Live range splitting |     % |
    |----------------+-------+----------------------+-------|
    | 400.perlbench  |   294 |                  290 | -1.36 |
    | 401.bzip2      |   384 |                  387 | +0.78 |
    | 403.gcc        |   239 |                  237 | -0.84 |
    | 429.mcf        |   227 |                  228 | +0.44 |
    | 445.gobmk      |   392 |                  390 | -0.51 |
    | 456.hmmer      |   343 |                  342 | -0.29 |
    | 458.sjeng      |   418 |                  411 | -1.67 |
    | 462.libquantum |   287 |                  288 | +0.35 |
    | 464.h264ref    |   417 |                  418 | +0.24 |
    | 471.omnetpp    |   275 |                  280 | +1.82 |
    | 473.astar      |   312 |                  295 | -5.45 |
    | 483.xalancbmk  |   183 |                  183 |  0.00 |
    |----------------+-------+----------------------+-------|
    | 433.milc       |   336 |                  335 | -0.30 |
    | 434.zeusmp     |   247 |                  246 | -0.40 |
    | 435.gromacs    |   259 |                  259 |  0.00 |
    | 436.cactusADM  |   192 |                  186 | -3.12 |
    | 437.leslie3d   |   227 |                  226 | -0.44 |
    | 444.namd       |   315 |                  315 |  0.00 |
    | 450.soplex     |   202 |                  200 | -0.99 |
    | 453.povray     |   136 |                  134 | -1.47 |
    | 454.calculix   |   300 |                  300 |  0.00 |
    | 459.GemsFDTD   |   273 |                  267 | -2.20 |
    | 465.tonto      |   685 |                  685 |  0.00 |
    | 470.lbm        |   212 |                  213 | +0.47 |
    | 482.sphinx3    |   363 |                  360 | -0.83 |

The patch is speculative: quite a few modifications do not lead to
shrink-wrapping, as outlined in the following table, which shows how
many of the changed functions still needed their prologue in the first
BB.

    Number of functions touched:
    |                         |  Modified |         |       |
    | Source                  | functions | In vain |     % |
    |-------------------------+-----------+---------+-------|
    | SPEC 2006 Int           |      2004 |     745 | 37.18 |
    | SPEC 2006 FP (*)        |      1491 |     881 | 59.09 |
    | C/C++ bootstrap stage 2 |      2864 |     935 | 32.65 |
    | Python 2.7.5            |       351 |     121 | 34.47 |
    | Python 3.3.2            |       410 |     151 | 36.83 |
    | Ruby 2.0.0-p247         |       336 |     132 | 39.29 |

I believe we can do even more shrink-wrappings and thus make the above
percentages smaller if we attempt to fix a similar problem with the
return value.  The parameters are however clearly the biggest obstacle
and need to be dealt with somehow.

The patch also adds quite a lot of extra computation, especially for
functions which are not rejected early on.  The last table below
summarizes how much extra work is done.  Total is the number of
functions on which split_live_ranges_for_shrink_wrap was called; if I
understand function ira correctly, that excludes large functions.

      Number of functions for which:
        1 - all BBs were scanned for calls
        2 - the first BB was scanned for register moves
        3 - dominators were calculated
        4 - loops were calculated
        5 - post-dominators were calculated
        6 - the function was modified

    | Source                  | Total |     1 |     % |     2 |     % |    3 |     % |    4 |     % |    5 |     % |    6 |    % |
    |-------------------------+-------+-------+-------+-------+-------+------+-------+------+-------+------+-------+------+------|
    | SPEC 2006 Int           | 30816 | 11888 | 38.58 |  9902 | 32.13 | 7889 | 25.60 | 2911 |  9.45 | 2692 |  8.74 | 2004 | 6.50 |
    | SPEC 2006 FP (*)        | 17957 |  9239 | 51.45 |  7016 | 39.07 | 6507 | 36.24 | 3640 | 20.27 | 3496 | 19.47 | 1491 | 8.30 |
    | C/C++ bootstrap stage 2 | 48163 | 19756 | 41.02 | 16211 | 33.66 | 9598 | 19.93 | 4535 |  9.42 | 4174 |  8.67 | 2864 | 5.95 |
    | Python 2.7.5            |  5472 |  2911 | 53.20 |  2437 | 44.54 | 1987 | 36.31 |  632 | 11.55 |  587 | 10.73 |  351 | 6.41 |
    | Python 3.3.2            |  6156 |  3246 | 52.73 |  2656 | 43.14 | 2165 | 35.17 |  669 | 10.87 |  629 | 10.22 |  410 | 6.66 |
    | Ruby 2.0.0-p247         |  7866 |  3834 | 48.74 |  3333 | 42.37 | 2588 | 32.90 |  797 | 10.13 |  731 |  9.29 |  336 | 4.27 |


The patch itself is below.  It passes bootstrap and testing on
x86_64-linux.  Because it is basically my first RTL-optimization
patch, I expect loads of comments and requests for changes.
Nevertheless, I think it would be nice to have it (or some later
version) in the upcoming release.

Thanks,

Martin


2013-10-18  Martin Jambor  <mjambor@suse.cz>

	PR rtl-optimization/10474
	* ira.c (interesting_dest_for_shprep): New function.
	(split_live_ranges_for_shrink_wrap): Likewise.
	(ira): Call split_live_ranges_for_shrink_wrap.

testsuite/
	* gcc.dg/pr10474.c: New testcase.
	* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
	* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.

diff --git a/gcc/ira.c b/gcc/ira.c
index 203fbff..fe208e2 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -4314,6 +4314,197 @@ find_moveable_pseudos (void)
   free_dominance_info (CDI_DOMINATORS);
 }
 
+
+/* If insn is interesting for parameter range-splitting shrink-wrapping
+   preparation, i.e. it is a single set from a hard register to a pseudo, which
+   is live at CALL_DOM, return the destination.  Otherwise return NULL.  */
+
+static rtx
+interesting_dest_for_shprep (rtx insn, basic_block call_dom)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return NULL;
+  rtx src = SET_SRC (set);
+  rtx dest = SET_DEST (set);
+  if (!REG_P (src) || !HARD_REGISTER_P (src)
+      || !REG_P (dest) || HARD_REGISTER_P (dest)
+      || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))
+    return NULL;
+  return dest;
+}
+
+/* Split live ranges of pseudos that are loaded from hard registers in the
+   first BB in a BB that dominates all non-sibling call if such a BB can be
+   found and is not in a loop.  */
+
+static void
+split_live_ranges_for_shrink_wrap (void)
+{
+  basic_block bb, call_dom = NULL;
+  basic_block first = single_succ (ENTRY_BLOCK_PTR);
+  rtx insn, last_interesting_insn = NULL;
+  bitmap_head need_new, reachable;
+  vec<basic_block> queue;
+
+  if (!flag_shrink_wrap)
+    return;
+
+  bitmap_initialize (&need_new, 0);
+  bitmap_initialize (&reachable, 0);
+  queue.create (n_basic_blocks);
+
+  FOR_EACH_BB (bb)
+    FOR_BB_INSNS (bb, insn)
+      if (CALL_P (insn) && !SIBLING_CALL_P (insn))
+	{
+	  if (bb == first)
+	    {
+	      bitmap_clear (&need_new);
+	      bitmap_clear (&reachable);
+	      queue.release ();
+	      return;
+	    }
+
+	  bitmap_set_bit (&need_new, bb->index);
+	  bitmap_set_bit (&reachable, bb->index);
+	  queue.quick_push (bb);
+	  break;
+	}
+
+  if (queue.is_empty ())
+    {
+      bitmap_clear (&need_new);
+      bitmap_clear (&reachable);
+      queue.release ();
+      return;
+    }
+
+  while (!queue.is_empty ())
+    {
+      edge e;
+      edge_iterator ei;
+
+      bb = queue.pop ();
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (e->dest != EXIT_BLOCK_PTR
+	    && bitmap_set_bit (&reachable, e->dest->index))
+	  queue.quick_push (e->dest);
+    }
+  queue.release ();
+
+  FOR_BB_INSNS (first, insn)
+    {
+      rtx dest = interesting_dest_for_shprep (insn, NULL);
+      if (!dest)
+	continue;
+
+      if (DF_REG_DEF_COUNT (REGNO (dest)) > 1)
+	{
+	  bitmap_clear (&need_new);
+	  bitmap_clear (&reachable);
+	  return;
+	}
+
+      for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
+	   use;
+	   use = DF_REF_NEXT_REG (use))
+	{
+	  if (NONDEBUG_INSN_P (DF_REF_INSN (use))
+	      && GET_CODE (DF_REF_REG (use)) == SUBREG)
+	    {
+	      /* This is necessary to avoid hitting an assert at
+		 postreload.c:2294 in libstdc++ testcases on x86_64-linux.  I'm
+		 not really sure what the problem actually is there.  */
+	      bitmap_clear (&need_new);
+	      bitmap_clear (&reachable);
+	      return;
+	    }
+
+	  rtx uin = DF_REF_INSN (use);
+	  int ubbi = BLOCK_FOR_INSN (uin)->index;
+	  if (bitmap_bit_p (&reachable, ubbi))
+	    bitmap_set_bit (&need_new, ubbi);
+	}
+      last_interesting_insn = insn;
+    }
+
+  bitmap_clear (&reachable);
+  if (!last_interesting_insn)
+    {
+      bitmap_clear (&need_new);
+      return;
+    }
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  call_dom = nearest_common_dominator_for_set (CDI_DOMINATORS, &need_new);
+  bitmap_clear (&need_new);
+  if (call_dom == first)
+    goto out;
+
+  loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+  while (bb_loop_depth (call_dom) > 0)
+    call_dom = get_immediate_dominator (CDI_DOMINATORS, call_dom);
+  loop_optimizer_finalize ();
+
+  if (call_dom == first)
+    goto out;
+
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+  if (dominated_by_p (CDI_POST_DOMINATORS, first, call_dom))
+    {
+      free_dominance_info (CDI_POST_DOMINATORS);
+      goto out;
+    }
+  free_dominance_info (CDI_POST_DOMINATORS);
+
+  if (dump_file)
+    fprintf (dump_file, "Will split live ranges of parameters at BB %i\n",
+	     call_dom->index);
+
+  FOR_BB_INSNS (first, insn)
+    {
+      rtx dest = interesting_dest_for_shprep (insn, call_dom);
+      if (!dest)
+	continue;
+
+      rtx newreg = NULL_RTX;
+      df_ref use, next;
+      for (use = DF_REG_USE_CHAIN (REGNO(dest)); use; use = next)
+	{
+	  rtx uin = DF_REF_INSN (use);
+	  next = DF_REF_NEXT_REG (use);
+
+	  basic_block ubb = BLOCK_FOR_INSN (uin);
+	  if (ubb == call_dom
+	      || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
+	    {
+	      if (!newreg)
+		newreg = ira_create_new_reg (dest);
+	      validate_change (uin, DF_REF_LOC (use), newreg, true);
+	    }
+	}
+
+      if (newreg)
+	{
+	  rtx new_move = gen_move_insn (newreg, dest);
+	  emit_insn_after (new_move, bb_note (call_dom));
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "Split live-range of register ");
+	      print_rtl_single (dump_file, dest);
+	    }
+	}
+
+      if (insn == last_interesting_insn)
+	break;
+    }
+  apply_change_group();
+
+ out:
+  free_dominance_info (CDI_DOMINATORS);
+}
+
 /* Perform the second half of the transformation started in
    find_moveable_pseudos.  We look for instances where the newly introduced
    pseudo remains unallocated, and remove it by moving the definition to
@@ -4522,7 +4713,10 @@ ira (FILE *f)
      allocation because of -O0 usage or because the function is too
      big.  */
   if (ira_conflicts_p)
-    find_moveable_pseudos ();
+    {
+      find_moveable_pseudos ();
+      split_live_ranges_for_shrink_wrap ();
+    }
 
   max_regno_before_ira = max_reg_num ();
   ira_setup_eliminable_regset (true);
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
new file mode 100644
index 0000000..fe497c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
+
+int __attribute__((noinline, noclone))
+foo (int a)
+{
+  return a + 5;
+}
+
+static int g;
+
+int __attribute__((noinline, noclone))
+bar (int a)
+{
+  int r;
+
+  if (a)
+    {
+      r = foo (a);
+      g = r + a;
+    }
+  else
+    r = a+1;
+  return r;
+}
+
+/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
new file mode 100644
index 0000000..872a757
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
+
+int __attribute__((noinline, noclone))
+foo (int a)
+{
+  return a + 5;
+}
+
+static int g;
+
+int __attribute__((noinline, noclone))
+bar (int a)
+{
+  int r;
+
+  if (a)
+    {
+      r = a;
+      while (r < 500)
+	if (r % 2)
+	  r = foo (r);
+	else
+	  r = foo (r+1);
+      g = r + a;
+    }
+  else
+    r = g+1;
+  return r;
+}
+
+/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.dg/pr10474.c b/gcc/testsuite/gcc.dg/pr10474.c
new file mode 100644
index 0000000..ee085c3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr10474.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-pro_and_epilogue"  } */
+
+void f(int *i)
+{
+	if (!i)
+		return;
+	else
+	{
+		__builtin_printf("Hi");
+		*i=0;
+	}
+}
+
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */


* Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
  2013-10-21 16:46 [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping Martin Jambor
@ 2013-10-22  0:03 ` Steven Bosscher
  2013-10-22  6:52   ` Vladimir Makarov
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Bosscher @ 2013-10-22  0:03 UTC (permalink / raw)
  To: GCC Patches, Vladimir Makarov

On Mon, Oct 21, 2013 at 6:32 PM, Martin Jambor wrote:
> --- a/gcc/ira.c
> +++ b/gcc/ira.c
> @@ -4314,6 +4314,197 @@ find_moveable_pseudos (void)
>    free_dominance_info (CDI_DOMINATORS);
>  }
>
> +
> +/* If insn is interesting for parameter range-splitting shrink-wrapping
> +   preparation, i.e. it is a single set from a hard register to a pseudo, which
> +   is live at CALL_DOM, return the destination.  Otherwise return NULL.  */
> +
> +static rtx
> +interesting_dest_for_shprep (rtx insn, basic_block call_dom)
> +{
> +  rtx set = single_set (insn);
> +  if (!set)
> +    return NULL;
> +  rtx src = SET_SRC (set);
> +  rtx dest = SET_DEST (set);
> +  if (!REG_P (src) || !HARD_REGISTER_P (src)
> +      || !REG_P (dest) || HARD_REGISTER_P (dest)
> +      || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))

See below...


> +    return NULL;
> +  return dest;
> +}
> +
> +/* Split live ranges of pseudos that are loaded from hard registers in the
> +   first BB in a BB that dominates all non-sibling call if such a BB can be
> +   found and is not in a loop.  */
> +
> +static void
> +split_live_ranges_for_shrink_wrap (void)
> +{
> +  basic_block bb, call_dom = NULL;
> +  basic_block first = single_succ (ENTRY_BLOCK_PTR);
> +  rtx insn, last_interesting_insn = NULL;
> +  bitmap_head need_new, reachable;
> +  vec<basic_block> queue;
> +
> +  if (!flag_shrink_wrap)
> +    return;
> +
> +  bitmap_initialize (&need_new, 0);
> +  bitmap_initialize (&reachable, 0);
> +  queue.create (n_basic_blocks);
> +
> +  FOR_EACH_BB (bb)
> +    FOR_BB_INSNS (bb, insn)
> +      if (CALL_P (insn) && !SIBLING_CALL_P (insn))
> +       {
> +         if (bb == first)
> +           {
> +             bitmap_clear (&need_new);
> +             bitmap_clear (&reachable);
> +             queue.release ();
> +             return;
> +           }
> +
> +         bitmap_set_bit (&need_new, bb->index);
> +         bitmap_set_bit (&reachable, bb->index);
> +         queue.quick_push (bb);
> +         break;
> +       }
> +
> +  if (queue.is_empty ())
> +    {
> +      bitmap_clear (&need_new);
> +      bitmap_clear (&reachable);
> +      queue.release ();
> +      return;
> +    }
> +
> +  while (!queue.is_empty ())
> +    {
> +      edge e;
> +      edge_iterator ei;
> +
> +      bb = queue.pop ();
> +      FOR_EACH_EDGE (e, ei, bb->succs)
> +       if (e->dest != EXIT_BLOCK_PTR
> +           && bitmap_set_bit (&reachable, e->dest->index))
> +         queue.quick_push (e->dest);
> +    }
> +  queue.release ();
> +
> +  FOR_BB_INSNS (first, insn)
> +    {
> +      rtx dest = interesting_dest_for_shprep (insn, NULL);
> +      if (!dest)
> +       continue;
> +
> +      if (DF_REG_DEF_COUNT (REGNO (dest)) > 1)

See below...


> +       {
> +         bitmap_clear (&need_new);
> +         bitmap_clear (&reachable);
> +         return;
> +       }
> +
> +      for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
> +          use;
> +          use = DF_REF_NEXT_REG (use))

You're using DF in these places. But IRA and LRA don't work with DF.
After update_equiv_regs DF caches and liveness may be incorrect. You'd
have to add a df_analyze call but I'm not sure how that will interact
with IRA/LRA's own dataflow frameworks (e.g. w.r.t.
REG_DEAD/REG_UNUSED notes).


> +         rtx uin = DF_REF_INSN (use);
> +         int ubbi = BLOCK_FOR_INSN (uin)->index;

int ubbi = DF_REF_BB (use)->index?

Ciao!
Steven


* Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
  2013-10-22  0:03 ` Steven Bosscher
@ 2013-10-22  6:52   ` Vladimir Makarov
  2013-10-23 17:20     ` Martin Jambor
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Makarov @ 2013-10-22  6:52 UTC (permalink / raw)
  To: Martin Jambor; +Cc: Steven Bosscher, GCC Patches

On 13-10-21 6:56 PM, Steven Bosscher wrote:
>
> +       {
> +         bitmap_clear (&need_new);
> +         bitmap_clear (&reachable);
> +         return;
> +       }
> +
> +      for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
> +          use;
> +          use = DF_REF_NEXT_REG (use))
> You're using DF in these places. But IRA and LRA don't work with DF.
> After update_equiv_regs DF caches and liveness may be incorrect. You'd
> have to add a df_analyze call but I'm not sure how that will interact
> with IRA/LRA's own dataflow frameworks (e.g. w.r.t.
> REG_DEAD/REG_UNUSED notes).
>
>
Sorry, Martin.  I think Steven is right.  IRA/LRA (and the reload pass)
make so many changes to the RTL that keeping the DF infrastructure up
to date would slow down the compiler a lot, and therefore DF info is
not updated during RA.  Your patch mostly sees correct DF info because
few changes have been made since updating was turned off.

You could move your optimization a bit up before df_clear_flags 
(DF_NO_INSN_RESCAN); or move this call right after your optimizations 
(possibly some minor df calls are needed too to restore live info for RA 
after your RTL changes).
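
The staleness problem can be illustrated with a toy model.  This is a
sketch of the general idea only and does not use or mirror GCC's actual
DF data structures: the struct toy_df, its no_rescan field and both
functions are invented for illustration, with no_rescan standing in
loosely for the effect of DF_NO_INSN_RESCAN.

```c
#define TOY_MAX_INSNS 8
#define TOY_NREGS 4

/* Toy "dataflow" state: each insn reads one register, and use_count
   caches how many insns read each register.  */
struct toy_df
{
  int insn_reg[TOY_MAX_INSNS];	/* Register read by each toy insn.  */
  int n_insns;
  int use_count[TOY_NREGS];	/* Cached information.  */
  int no_rescan;		/* When set, changes do not update the cache.  */
};

/* Recompute the cache from scratch, like a full re-analysis.  */
void toy_analyze (struct toy_df *df)
{
  for (int r = 0; r < TOY_NREGS; r++)
    df->use_count[r] = 0;
  for (int i = 0; i < df->n_insns; i++)
    df->use_count[df->insn_reg[i]]++;
}

/* Make insn INSN read NEW_REG.  The cache is only kept in sync when
   rescanning is enabled.  */
void toy_change_use (struct toy_df *df, int insn, int new_reg)
{
  int old_reg = df->insn_reg[insn];
  df->insn_reg[insn] = new_reg;
  if (!df->no_rescan)
    {
      df->use_count[old_reg]--;
      df->use_count[new_reg]++;
    }
}
```

With no_rescan set, a call to toy_change_use leaves use_count describing
the old code until toy_analyze runs again, which is the kind of
mismatch a pass reading cached use chains would have to guard against.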


* Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
  2013-10-22  6:52   ` Vladimir Makarov
@ 2013-10-23 17:20     ` Martin Jambor
  2013-10-23 23:49       ` Steven Bosscher
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Jambor @ 2013-10-23 17:20 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Steven Bosscher, GCC Patches

Hi,

On Mon, Oct 21, 2013 at 11:00:38PM -0400, Vladimir Makarov wrote:
> On 13-10-21 6:56 PM, Steven Bosscher wrote:
> >
> >+       {
> >+         bitmap_clear (&need_new);
> >+         bitmap_clear (&reachable);
> >+         return;
> >+       }
> >+
> >+      for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
> >+          use;
> >+          use = DF_REF_NEXT_REG (use))
> >You're using DF in these places. But IRA and LRA don't work with DF.
> >After update_equiv_regs DF caches and liveness may be incorrect. You'd
> >have to add a df_analyze call but I'm not sure how that will interact
> >with IRA/LRA's own dataflow frameworks (e.g. w.r.t.
> >REG_DEAD/REG_UNUSED notes).
> >
> >
> Sorry, Martin.  I think Steven is right.  IRA/LRA (and reload pass)
> creates so many changes in RTL that DF infrastructure would slow
> down the compiler a lot and therefore df info is not updated during
> RA.

no need to be sorry, I absolutely anticipated comments and requests
for changes.

>  Your patch mostly uses a correct DF-info because there are few
> changes since updating is off.

I think the reason why it works is that find_moveable_pseudos, which
is called immediately before the function I'm adding, already calls
df_analyze.  I suppose it does not cause any trouble since the call has
been there for quite a few months already.  Because find_moveable_pseudos
will never split registers which are defined by a set from a hard
register (because rtx_moveable_p returns false for a HARD_REGISTER_P),
the analysis results should still be perfectly up-to-date for the
purposes of my transformation.

So, do you think that this could be just made more explicit by moving
the call to df_analyze (and dominator calculation) out of
find_moveable_pseudos to ira() as in the (bootstrapped, tested) patch
below?

> You could move your optimization a bit up before df_clear_flags
> (DF_NO_INSN_RESCAN); or move this call right after your
> optimizations (possibly some minor df calls are needed too to
> restore live info for RA after your RTL changes).

I tried this but got dark glibc double free and segfault errors from
deep down in the call to df_analyze in find_moveable_pseudos, so I
quickly chickened out.  I will re-try (or even move the transformation
more up front) if you or Steven reject this attempt.

Thanks,

Martin


2013-10-23  Martin Jambor  <mjambor@suse.cz>

	PR rtl-optimization/10474
	* ira.c (find_moveable_pseudos): Do not calculate dominance info
	nor df analysis.
	(interesting_dest_for_shprep): New function.
	(split_live_ranges_for_shrink_wrap): Likewise.
	(ira): Calculate dominance info and df analysis. Call
	split_live_ranges_for_shrink_wrap.

testsuite/
	* gcc.dg/pr10474.c: New testcase.
	* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
	* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.

diff --git a/gcc/ira.c b/gcc/ira.c
index 203fbff..0faea8f 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3989,9 +3989,6 @@ find_moveable_pseudos (void)
   pseudo_replaced_reg.release ();
   pseudo_replaced_reg.safe_grow_cleared (max_regs);
 
-  df_analyze ();
-  calculate_dominance_info (CDI_DOMINATORS);
-
   i = 0;
   bitmap_initialize (&live, 0);
   bitmap_initialize (&used, 0);
@@ -4311,7 +4308,192 @@ find_moveable_pseudos (void)
   regstat_free_ri ();
   regstat_init_n_sets_and_refs ();
   regstat_compute_ri ();
-  free_dominance_info (CDI_DOMINATORS);
+}
+
+
+/* If insn is interesting for parameter range-splitting shrink-wrapping
+   preparation, i.e. it is a single set from a hard register to a pseudo, which
+   is live at CALL_DOM, return the destination.  Otherwise return NULL.  */
+
+static rtx
+interesting_dest_for_shprep (rtx insn, basic_block call_dom)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return NULL;
+  rtx src = SET_SRC (set);
+  rtx dest = SET_DEST (set);
+  if (!REG_P (src) || !HARD_REGISTER_P (src)
+      || !REG_P (dest) || HARD_REGISTER_P (dest)
+      || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))
+    return NULL;
+  return dest;
+}
+
+/* Split live ranges of pseudos that are loaded from hard registers in the
+   first BB in a BB that dominates all non-sibling call if such a BB can be
+   found and is not in a loop.  */
+
+static void
+split_live_ranges_for_shrink_wrap (void)
+{
+  basic_block bb, call_dom = NULL;
+  basic_block first = single_succ (ENTRY_BLOCK_PTR);
+  rtx insn, last_interesting_insn = NULL;
+  bitmap_head need_new, reachable;
+  vec<basic_block> queue;
+
+  if (!flag_shrink_wrap)
+    return;
+
+  bitmap_initialize (&need_new, 0);
+  bitmap_initialize (&reachable, 0);
+  queue.create (n_basic_blocks);
+
+  FOR_EACH_BB (bb)
+    FOR_BB_INSNS (bb, insn)
+      if (CALL_P (insn) && !SIBLING_CALL_P (insn))
+	{
+	  if (bb == first)
+	    {
+	      bitmap_clear (&need_new);
+	      bitmap_clear (&reachable);
+	      queue.release ();
+	      return;
+	    }
+
+	  bitmap_set_bit (&need_new, bb->index);
+	  bitmap_set_bit (&reachable, bb->index);
+	  queue.quick_push (bb);
+	  break;
+	}
+
+  if (queue.is_empty ())
+    {
+      bitmap_clear (&need_new);
+      bitmap_clear (&reachable);
+      queue.release ();
+      return;
+    }
+
+  while (!queue.is_empty ())
+    {
+      edge e;
+      edge_iterator ei;
+
+      bb = queue.pop ();
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (e->dest != EXIT_BLOCK_PTR
+	    && bitmap_set_bit (&reachable, e->dest->index))
+	  queue.quick_push (e->dest);
+    }
+  queue.release ();
+
+  FOR_BB_INSNS (first, insn)
+    {
+      rtx dest = interesting_dest_for_shprep (insn, NULL);
+      if (!dest)
+	continue;
+
+      if (DF_REG_DEF_COUNT (REGNO (dest)) > 1)
+	{
+	  bitmap_clear (&need_new);
+	  bitmap_clear (&reachable);
+	  return;
+	}
+
+      for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
+	   use;
+	   use = DF_REF_NEXT_REG (use))
+	{
+	  if (NONDEBUG_INSN_P (DF_REF_INSN (use))
+	      && GET_CODE (DF_REF_REG (use)) == SUBREG)
+	    {
+	      /* This is necessary to avoid hitting an assert at
+		 postreload.c:2294 in libstdc++ testcases on x86_64-linux.  I'm
+		 not really sure what the problem actually is there.  */
+	      bitmap_clear (&need_new);
+	      bitmap_clear (&reachable);
+	      return;
+	    }
+
+	  int ubbi = DF_REF_BB (use)->index;
+	  if (bitmap_bit_p (&reachable, ubbi))
+	    bitmap_set_bit (&need_new, ubbi);
+	}
+      last_interesting_insn = insn;
+    }
+
+  bitmap_clear (&reachable);
+  if (!last_interesting_insn)
+    {
+      bitmap_clear (&need_new);
+      return;
+    }
+
+  call_dom = nearest_common_dominator_for_set (CDI_DOMINATORS, &need_new);
+  bitmap_clear (&need_new);
+  if (call_dom == first)
+    return;
+
+  loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+  while (bb_loop_depth (call_dom) > 0)
+    call_dom = get_immediate_dominator (CDI_DOMINATORS, call_dom);
+  loop_optimizer_finalize ();
+
+  if (call_dom == first)
+    return;
+
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+  if (dominated_by_p (CDI_POST_DOMINATORS, first, call_dom))
+    {
+      free_dominance_info (CDI_POST_DOMINATORS);
+      return;
+    }
+  free_dominance_info (CDI_POST_DOMINATORS);
+
+  if (dump_file)
+    fprintf (dump_file, "Will split live ranges of parameters at BB %i\n",
+	     call_dom->index);
+
+  FOR_BB_INSNS (first, insn)
+    {
+      rtx dest = interesting_dest_for_shprep (insn, call_dom);
+      if (!dest)
+	continue;
+
+      rtx newreg = NULL_RTX;
+      df_ref use, next;
+      for (use = DF_REG_USE_CHAIN (REGNO(dest)); use; use = next)
+	{
+	  rtx uin = DF_REF_INSN (use);
+	  next = DF_REF_NEXT_REG (use);
+
+	  basic_block ubb = BLOCK_FOR_INSN (uin);
+	  if (ubb == call_dom
+	      || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
+	    {
+	      if (!newreg)
+		newreg = ira_create_new_reg (dest);
+	      validate_change (uin, DF_REF_LOC (use), newreg, true);
+	    }
+	}
+
+      if (newreg)
+	{
+	  rtx new_move = gen_move_insn (newreg, dest);
+	  emit_insn_after (new_move, bb_note (call_dom));
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "Split live-range of register ");
+	      print_rtl_single (dump_file, dest);
+	    }
+	}
+
+      if (insn == last_interesting_insn)
+	break;
+    }
+  apply_change_group ();
 }
 
 /* Perform the second half of the transformation started in
@@ -4522,7 +4704,15 @@ ira (FILE *f)
      allocation because of -O0 usage or because the function is too
      big.  */
   if (ira_conflicts_p)
-    find_moveable_pseudos ();
+    {
+      df_analyze ();
+      calculate_dominance_info (CDI_DOMINATORS);
+
+      find_moveable_pseudos ();
+      split_live_ranges_for_shrink_wrap ();
+
+      free_dominance_info (CDI_DOMINATORS);
+    }
 
   max_regno_before_ira = max_reg_num ();
   ira_setup_eliminable_regset (true);
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
new file mode 100644
index 0000000..fe497c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
+
+int __attribute__((noinline, noclone))
+foo (int a)
+{
+  return a + 5;
+}
+
+static int g;
+
+int __attribute__((noinline, noclone))
+bar (int a)
+{
+  int r;
+
+  if (a)
+    {
+      r = foo (a);
+      g = r + a;
+    }
+  else
+    r = a+1;
+  return r;
+}
+
+/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
new file mode 100644
index 0000000..872a757
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
+
+int __attribute__((noinline, noclone))
+foo (int a)
+{
+  return a + 5;
+}
+
+static int g;
+
+int __attribute__((noinline, noclone))
+bar (int a)
+{
+  int r;
+
+  if (a)
+    {
+      r = a;
+      while (r < 500)
+	if (r % 2)
+	  r = foo (r);
+	else
+	  r = foo (r+1);
+      g = r + a;
+    }
+  else
+    r = g+1;
+  return r;
+}
+
+/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.dg/pr10474.c b/gcc/testsuite/gcc.dg/pr10474.c
new file mode 100644
index 0000000..ee085c3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr10474.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-pro_and_epilogue"  } */
+
+void f(int *i)
+{
+	if (!i)
+		return;
+	else
+	{
+		__builtin_printf("Hi");
+		*i=0;
+	}
+}
+
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */


* Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
  2013-10-23 17:20     ` Martin Jambor
@ 2013-10-23 23:49       ` Steven Bosscher
  2013-10-25 15:49         ` Martin Jambor
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Bosscher @ 2013-10-23 23:49 UTC (permalink / raw)
  To: Vladimir Makarov, Steven Bosscher, GCC Patches

On Wed, Oct 23, 2013 at 6:46 PM, Martin Jambor wrote:

>  /* Perform the second half of the transformation started in
> @@ -4522,7 +4704,15 @@ ira (FILE *f)
>       allocation because of -O0 usage or because the function is too
>       big.  */
>    if (ira_conflicts_p)
> -    find_moveable_pseudos ();
> +    {
> +      df_analyze ();
> +      calculate_dominance_info (CDI_DOMINATORS);
> +
> +      find_moveable_pseudos ();
> +      split_live_ranges_for_shrink_wrap ();
> +
> +      free_dominance_info (CDI_DOMINATORS);
> +    }
>

You probably want to add another df_analyze if
split_live_ranges_for_shrink_wrap makes code transformations. AFAIU
find_moveable_pseudos doesn't change global liveness but your
transformation might. IRA/LRA need up-to-date DF_LR results to compute
allocno live ranges.

Ciao!
Steven
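
To make the staleness concern concrete, here is a minimal, self-contained C
sketch.  It is a toy model and does not use GCC's DF API: struct func,
analyze_liveness, split_range and prepare_for_allocation are hypothetical
names invented for this illustration.  It assumes straight-line code, fewer
than 32 registers, and a single definition of the register being split.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_INSNS = 16, NO_REG = -1 };

/* One toy instruction: an optional destination and up to two sources.  */
struct insn { int def; int use[2]; };

struct func {
  struct insn insns[MAX_INSNS];
  int n_insns;
  int n_regs;
  unsigned live_in[MAX_INSNS];	/* cached analysis: regs live before insn i */
};

/* Backward scan that recomputes the cached live-in sets from scratch.  */
void analyze_liveness (struct func *f)
{
  unsigned live = 0;
  for (int i = f->n_insns - 1; i >= 0; i--)
    {
      const struct insn *in = &f->insns[i];
      if (in->def != NO_REG)
	live &= ~(1u << in->def);
      for (int k = 0; k < 2; k++)
	if (in->use[k] != NO_REG)
	  live |= 1u << in->use[k];
      f->live_in[i] = live;
    }
}

/* Insert "newreg = reg" at POS and rename later uses of REG to newreg.
   Return true if the code was changed, so the caller knows that the
   cached live-in sets no longer describe the instruction stream.  */
bool split_range (struct func *f, int reg, int pos)
{
  bool used_later = false;
  for (int i = pos; i < f->n_insns; i++)
    for (int k = 0; k < 2; k++)
      if (f->insns[i].use[k] == reg)
	used_later = true;
  if (!used_later || f->n_insns == MAX_INSNS)
    return false;

  int newreg = f->n_regs++;
  assert (newreg < 32);
  for (int i = f->n_insns; i > pos; i--)
    f->insns[i] = f->insns[i - 1];
  f->n_insns++;
  f->insns[pos] = (struct insn) { newreg, { reg, NO_REG } };
  for (int i = pos + 1; i < f->n_insns; i++)
    for (int k = 0; k < 2; k++)
      if (f->insns[i].use[k] == reg)
	f->insns[i].use[k] = newreg;
  return true;
}

/* Same shape as the suggestion: analyze, transform, and re-analyze only
   when the transformation reports a change.  */
void prepare_for_allocation (struct func *f, int reg, int pos)
{
  analyze_liveness (f);
  if (split_range (f, reg, pos))
    analyze_liveness (f);
}
```

Without the second analyze_liveness call, a consumer would still see live-in
sets computed for the shorter instruction list, with no entry at all for the
newly created register.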


* Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
  2013-10-23 23:49       ` Steven Bosscher
@ 2013-10-25 15:49         ` Martin Jambor
  2013-10-29 16:26           ` Vladimir Makarov
  2013-10-30 23:53           ` Jakub Jelinek
  0 siblings, 2 replies; 10+ messages in thread
From: Martin Jambor @ 2013-10-25 15:49 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Vladimir Makarov, GCC Patches

Hi,

On Thu, Oct 24, 2013 at 01:02:51AM +0200, Steven Bosscher wrote:
> On Wed, Oct 23, 2013 at 6:46 PM, Martin Jambor wrote:
> 
> >  /* Perform the second half of the transformation started in
> > @@ -4522,7 +4704,15 @@ ira (FILE *f)
> >       allocation because of -O0 usage or because the function is too
> >       big.  */
> >    if (ira_conflicts_p)
> > -    find_moveable_pseudos ();
> > +    {
> > +      df_analyze ();
> > +      calculate_dominance_info (CDI_DOMINATORS);
> > +
> > +      find_moveable_pseudos ();
> > +      split_live_ranges_for_shrink_wrap ();
> > +
> > +      free_dominance_info (CDI_DOMINATORS);
> > +    }
> >
> 
> You probably want to add another df_analyze if
> split_live_ranges_for_shrink_wrap makes code transformations. AFAIU
> find_moveable_pseudos doesn't change global liveness but your
> transformation might. IRA/LRA need up-to-date DF_LR results to compute
> allocno live ranges.
> 

OK, I have changed the patch to do that (it is below, still bootstraps
and passes tests on x86_64 fine).  However, I have noticed that the
corresponding part in function ira now looks like:

  /* ... */
  if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
    df_analyze ();

  /* It is not worth to do such improvement when we use a simple
     allocation because of -O0 usage or because the function is too
     big.  */
  if (ira_conflicts_p)
    {
      df_analyze ();
      calculate_dominance_info (CDI_DOMINATORS);

      find_moveable_pseudos ();
      if (split_live_ranges_for_shrink_wrap ())
	df_analyze ();

      free_dominance_info (CDI_DOMINATORS);
    }
  /* ... */

So, that left me wondering whether the first call to df_analyze is
actually necessary, or whether perhaps the data are actually already
up to date.  What do you think?

Thanks for all the feedback,

Martin


2013-10-23  Martin Jambor  <mjambor@suse.cz>

	PR rtl-optimization/10474
	* ira.c (find_moveable_pseudos): Do not calculate dominance info
	nor df analysis.
	(interesting_dest_for_shprep): New function.
	(split_live_ranges_for_shrink_wrap): Likewise.
	(ira): Calculate dominance info and df analysis. Call
	split_live_ranges_for_shrink_wrap.

testsuite/
	* gcc.dg/pr10474.c: New testcase.
	* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
	* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.

diff --git a/gcc/ira.c b/gcc/ira.c
index 203fbff..532db31 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3989,9 +3989,6 @@ find_moveable_pseudos (void)
   pseudo_replaced_reg.release ();
   pseudo_replaced_reg.safe_grow_cleared (max_regs);
 
-  df_analyze ();
-  calculate_dominance_info (CDI_DOMINATORS);
-
   i = 0;
   bitmap_initialize (&live, 0);
   bitmap_initialize (&used, 0);
@@ -4311,7 +4308,196 @@ find_moveable_pseudos (void)
   regstat_free_ri ();
   regstat_init_n_sets_and_refs ();
   regstat_compute_ri ();
-  free_dominance_info (CDI_DOMINATORS);
+}
+
+
+/* If insn is interesting for parameter range-splitting shring-wrapping
+   preparation, i.e. it is a single set from a hard register to a pseudo, which
+   is live at CALL_DOM, return the destination.  Otherwise return NULL.  */
+
+static rtx
+interesting_dest_for_shprep (rtx insn, basic_block call_dom)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return NULL;
+  rtx src = SET_SRC (set);
+  rtx dest = SET_DEST (set);
+  if (!REG_P (src) || !HARD_REGISTER_P (src)
+      || !REG_P (dest) || HARD_REGISTER_P (dest)
+      || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))
+    return NULL;
+  return dest;
+}
+
+/* Split live ranges of pseudos that are loaded from hard registers in the
+   first BB in a BB that dominates all non-sibling call if such a BB can be
+   found and is not in a loop.  Return true if the function has made any
+   changes.  */
+
+static bool
+split_live_ranges_for_shrink_wrap (void)
+{
+  basic_block bb, call_dom = NULL;
+  basic_block first = single_succ (ENTRY_BLOCK_PTR);
+  rtx insn, last_interesting_insn = NULL;
+  bitmap_head need_new, reachable;
+  vec<basic_block> queue;
+
+  if (!flag_shrink_wrap)
+    return false;
+
+  bitmap_initialize (&need_new, 0);
+  bitmap_initialize (&reachable, 0);
+  queue.create (n_basic_blocks);
+
+  FOR_EACH_BB (bb)
+    FOR_BB_INSNS (bb, insn)
+      if (CALL_P (insn) && !SIBLING_CALL_P (insn))
+	{
+	  if (bb == first)
+	    {
+	      bitmap_clear (&need_new);
+	      bitmap_clear (&reachable);
+	      queue.release ();
+	      return false;
+	    }
+
+	  bitmap_set_bit (&need_new, bb->index);
+	  bitmap_set_bit (&reachable, bb->index);
+	  queue.quick_push (bb);
+	  break;
+	}
+
+  if (queue.is_empty ())
+    {
+      bitmap_clear (&need_new);
+      bitmap_clear (&reachable);
+      queue.release ();
+      return false;
+    }
+
+  while (!queue.is_empty ())
+    {
+      edge e;
+      edge_iterator ei;
+
+      bb = queue.pop ();
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (e->dest != EXIT_BLOCK_PTR
+	    && bitmap_set_bit (&reachable, e->dest->index))
+	  queue.quick_push (e->dest);
+    }
+  queue.release ();
+
+  FOR_BB_INSNS (first, insn)
+    {
+      rtx dest = interesting_dest_for_shprep (insn, NULL);
+      if (!dest)
+	continue;
+
+      if (DF_REG_DEF_COUNT (REGNO (dest)) > 1)
+	{
+	  bitmap_clear (&need_new);
+	  bitmap_clear (&reachable);
+	  return false;
+	}
+
+      for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
+	   use;
+	   use = DF_REF_NEXT_REG (use))
+	{
+	  if (NONDEBUG_INSN_P (DF_REF_INSN (use))
+	      && GET_CODE (DF_REF_REG (use)) == SUBREG)
+	    {
+	      /* This is necessary to avoid hitting an assert at
+		 postreload.c:2294 in libstc++ testcases on x86_64-linux.  I'm
+		 not really sure what the probblem actually is there.  */
+	      bitmap_clear (&need_new);
+	      bitmap_clear (&reachable);
+	      return false;
+	    }
+
+	  int ubbi = DF_REF_BB (use)->index;
+	  if (bitmap_bit_p (&reachable, ubbi))
+	    bitmap_set_bit (&need_new, ubbi);
+	}
+      last_interesting_insn = insn;
+    }
+
+  bitmap_clear (&reachable);
+  if (!last_interesting_insn)
+    {
+      bitmap_clear (&need_new);
+      return false;
+    }
+
+  call_dom = nearest_common_dominator_for_set (CDI_DOMINATORS, &need_new);
+  bitmap_clear (&need_new);
+  if (call_dom == first)
+    return false;
+
+  loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+  while (bb_loop_depth (call_dom) > 0)
+    call_dom = get_immediate_dominator (CDI_DOMINATORS, call_dom);
+  loop_optimizer_finalize ();
+
+  if (call_dom == first)
+    return false;
+
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+  if (dominated_by_p (CDI_POST_DOMINATORS, first, call_dom))
+    {
+      free_dominance_info (CDI_POST_DOMINATORS);
+      return false;
+    }
+  free_dominance_info (CDI_POST_DOMINATORS);
+
+  if (dump_file)
+    fprintf (dump_file, "Will split live ranges of parameters at BB %i\n",
+	     call_dom->index);
+
+  bool ret = false;
+  FOR_BB_INSNS (first, insn)
+    {
+      rtx dest = interesting_dest_for_shprep (insn, call_dom);
+      if (!dest)
+	continue;
+
+      rtx newreg = NULL_RTX;
+      df_ref use, next;
+      for (use = DF_REG_USE_CHAIN (REGNO(dest)); use; use = next)
+	{
+	  rtx uin = DF_REF_INSN (use);
+	  next = DF_REF_NEXT_REG (use);
+
+	  basic_block ubb = BLOCK_FOR_INSN (uin);
+	  if (ubb == call_dom
+	      || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
+	    {
+	      if (!newreg)
+		newreg = ira_create_new_reg (dest);
+	      validate_change (uin, DF_REF_LOC (use), newreg, true);
+	    }
+	}
+
+      if (newreg)
+	{
+	  rtx new_move = gen_move_insn (newreg, dest);
+	  emit_insn_after (new_move, bb_note (call_dom));
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "Split live-range of register ");
+	      print_rtl_single (dump_file, dest);
+	    }
+	  ret = true;
+	}
+
+      if (insn == last_interesting_insn)
+	break;
+    }
+  apply_change_group ();
+  return ret;
 }
 
 /* Perform the second half of the transformation started in
@@ -4522,7 +4708,16 @@ ira (FILE *f)
      allocation because of -O0 usage or because the function is too
      big.  */
   if (ira_conflicts_p)
-    find_moveable_pseudos ();
+    {
+      df_analyze ();
+      calculate_dominance_info (CDI_DOMINATORS);
+
+      find_moveable_pseudos ();
+      if (split_live_ranges_for_shrink_wrap ())
+	df_analyze ();
+
+      free_dominance_info (CDI_DOMINATORS);
+    }
 
   max_regno_before_ira = max_reg_num ();
   ira_setup_eliminable_regset (true);
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
new file mode 100644
index 0000000..fe497c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
+
+int __attribute__((noinline, noclone))
+foo (int a)
+{
+  return a + 5;
+}
+
+static int g;
+
+int __attribute__((noinline, noclone))
+bar (int a)
+{
+  int r;
+
+  if (a)
+    {
+      r = foo (a);
+      g = r + a;
+    }
+  else
+    r = a+1;
+  return r;
+}
+
+/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
new file mode 100644
index 0000000..872a757
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
+
+int __attribute__((noinline, noclone))
+foo (int a)
+{
+  return a + 5;
+}
+
+static int g;
+
+int __attribute__((noinline, noclone))
+bar (int a)
+{
+  int r;
+
+  if (a)
+    {
+      r = a;
+      while (r < 500)
+	if (r % 2)
+	  r = foo (r);
+	else
+	  r = foo (r+1);
+      g = r + a;
+    }
+  else
+    r = g+1;
+  return r;
+}
+
+/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.dg/pr10474.c b/gcc/testsuite/gcc.dg/pr10474.c
new file mode 100644
index 0000000..ee085c3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr10474.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-pro_and_epilogue"  } */
+
+void f(int *i)
+{
+	if (!i)
+		return;
+	else
+	{
+		__builtin_printf("Hi");
+		*i=0;
+	}
+}
+
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
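
The block-selection step of split_live_ranges_for_shrink_wrap in the patch
above can be sketched outside GCC on a toy CFG.  This is only an
illustration under stated assumptions: the CFG, the hand-written immediate
dominator table and all names (succ, idom, has_call, uses_param,
choose_split_block) are hypothetical, and the sketch leaves out the
loop-depth and post-dominator checks as well as the bail-outs for SUBREG
uses and multiply defined pseudos.

```c
#include <assert.h>
#include <stdbool.h>

enum { NBLOCKS = 5, MAXSUCC = 2, NONE = -1 };

/* Hypothetical CFG shaped like bar () in ira-shrinkwrap-prep-1.c:
   0 = first block, 1 = block with the call, 2 = code after the call,
   3 = call-free else branch, 4 = join block.  */
static const int succ[NBLOCKS][MAXSUCC] = {
  {1, 3}, {2, NONE}, {4, NONE}, {4, NONE}, {NONE, NONE}
};
/* Immediate dominators, written by hand; block 0 is the root.  */
static const int idom[NBLOCKS] = {0, 0, 1, 0, 0};
static const bool has_call[NBLOCKS] = {false, true, false, false, false};
static const bool uses_param[NBLOCKS] = {true, true, true, true, false};

/* Depth of block B in the dominator tree.  */
int dom_depth (int b)
{
  int d = 0;
  while (b != 0)
    {
      b = idom[b];
      d++;
    }
  return d;
}

/* Nearest common dominator of A and B, found by walking up the tree.  */
int common_dominator (int a, int b)
{
  int da = dom_depth (a), db = dom_depth (b);
  for (; da > db; da--)
    a = idom[a];
  for (; db > da; db--)
    b = idom[b];
  while (a != b)
    {
      a = idom[a];
      b = idom[b];
    }
  return a;
}

/* Return the block at which to split, or NONE if splitting cannot help.  */
int choose_split_block (void)
{
  bool reachable[NBLOCKS] = {false}, need_new[NBLOCKS] = {false};
  int queue[NBLOCKS], qlen = 0;

  /* Seed the worklist with every block that contains a call.  */
  for (int b = 0; b < NBLOCKS; b++)
    if (has_call[b])
      {
	if (b == 0)
	  return NONE;		/* a call in the first block: give up */
	need_new[b] = reachable[b] = true;
	queue[qlen++] = b;
      }
  if (qlen == 0)
    return NONE;

  /* Forward reachability from the call blocks.  */
  while (qlen > 0)
    {
      int b = queue[--qlen];
      for (int i = 0; i < MAXSUCC; i++)
	{
	  int s = succ[b][i];
	  if (s != NONE && !reachable[s])
	    {
	      reachable[s] = true;
	      assert (qlen < NBLOCKS);
	      queue[qlen++] = s;
	    }
	}
    }

  /* Uses of the parameter that a call can reach need the new pseudo.  */
  for (int b = 0; b < NBLOCKS; b++)
    if (uses_param[b] && reachable[b])
      need_new[b] = true;

  int dom = NONE;
  for (int b = 0; b < NBLOCKS; b++)
    if (need_new[b])
      dom = dom == NONE ? b : common_dominator (dom, b);
  return dom == 0 ? NONE : dom;
}
```

On this CFG the chosen block is 1, the block containing the call.  The use
in the else branch (block 3) is not reachable from any call, so it keeps
the original pseudo.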


* Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
  2013-10-25 15:49         ` Martin Jambor
@ 2013-10-29 16:26           ` Vladimir Makarov
  2013-10-30 23:53           ` Jakub Jelinek
  1 sibling, 0 replies; 10+ messages in thread
From: Vladimir Makarov @ 2013-10-29 16:26 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches

On 10/25/2013 11:19 AM, Martin Jambor wrote:
> Hi,
>
> On Thu, Oct 24, 2013 at 01:02:51AM +0200, Steven Bosscher wrote:
>> On Wed, Oct 23, 2013 at 6:46 PM, Martin Jambor wrote:
>>
>>>  /* Perform the second half of the transformation started in
>>> @@ -4522,7 +4704,15 @@ ira (FILE *f)
>>>       allocation because of -O0 usage or because the function is too
>>>       big.  */
>>>    if (ira_conflicts_p)
>>> -    find_moveable_pseudos ();
>>> +    {
>>> +      df_analyze ();
>>> +      calculate_dominance_info (CDI_DOMINATORS);
>>> +
>>> +      find_moveable_pseudos ();
>>> +      split_live_ranges_for_shrink_wrap ();
>>> +
>>> +      free_dominance_info (CDI_DOMINATORS);
>>> +    }
>>>
>> You probably want to add another df_analyze if
>> split_live_ranges_for_shrink_wrap makes code transformations. AFAIU
>> find_moveable_pseudos doesn't change global liveness but your
>> transformation might. IRA/LRA need up-to-date DF_LR results to compute
>> allocno live ranges.
>>
> OK, I have changed the patch to do that (it is below, still bootstraps
> and passes tests on x86_64 fine).  However, I have noticed that the
> corresponding part in function ira now looks like:
>
>   /* ... */
>   if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
>     df_analyze ();
>
>   /* It is not worth to do such improvement when we use a simple
>      allocation because of -O0 usage or because the function is too
>      big.  */
>   if (ira_conflicts_p)
>     {
>       df_analyze ();
>       calculate_dominance_info (CDI_DOMINATORS);
>
>       find_moveable_pseudos ();
>       if (split_live_ranges_for_shrink_wrap ())
> 	df_analyze ();
>
>       free_dominance_info (CDI_DOMINATORS);
>     }
>   /* ... */
>
> So, that left me wondering whether the first call to df_analyze is
> actually necessary, or whether perhaps the data are actually already
> up to date.  What do you think?

I guess it needs some investigation.  The delete_trivially_dead_insns code
was taken from the old RA.  First of all, I don't know how many insns
are really trivially dead before RA in optimization and non-optimization
mode.  Maybe the code can be removed altogether.  I'll put it on my todo list.

The patch is ok to commit.  Thanks for working on this, Martin.

>
> 2013-10-23  Martin Jambor  <mjambor@suse.cz>
>
> 	PR rtl-optimization/10474
> 	* ira.c (find_moveable_pseudos): Do not calculate dominance info
> 	nor df analysis.
> 	(interesting_dest_for_shprep): New function.
> 	(split_live_ranges_for_shrink_wrap): Likewise.
> 	(ira): Calculate dominance info and df analysis. Call
> 	split_live_ranges_for_shrink_wrap.
>
> testsuite/
> 	* gcc.dg/pr10474.c: New testcase.
> 	* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
> 	* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
>
>


* Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
  2013-10-25 15:49         ` Martin Jambor
  2013-10-29 16:26           ` Vladimir Makarov
@ 2013-10-30 23:53           ` Jakub Jelinek
  2013-10-31  7:02             ` ICE with "[PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping" Hans-Peter Nilsson
  2013-10-31 14:13             ` [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping Martin Jambor
  1 sibling, 2 replies; 10+ messages in thread
From: Jakub Jelinek @ 2013-10-30 23:53 UTC (permalink / raw)
  To: Steven Bosscher, Vladimir Makarov, GCC Patches

On Fri, Oct 25, 2013 at 05:19:06PM +0200, Martin Jambor wrote:
> 2013-10-23  Martin Jambor  <mjambor@suse.cz>
> 
> 	PR rtl-optimization/10474
> 	* ira.c (find_moveable_pseudos): Do not calculate dominance info
> 	nor df analysis.
> 	(interesting_dest_for_shprep): New function.
> 	(split_live_ranges_for_shrink_wrap): Likewise.
> 	(ira): Calculate dominance info and df analysis. Call
> 	split_live_ranges_for_shrink_wrap.
> 
> testsuite/
> 	* gcc.dg/pr10474.c: New testcase.
> 	* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
> 	* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.

Unfortunately this patch breaks i686-linux bootstrap,
in r204204 compare passes, while in r204205 I'm getting .bad_compare
gcc/fortran/module.o differs
gcc/ipa.o differs
gcc/go/gogo.o differs
gcc/go/statements.o differs
Most likely combine.o is miscompiled, but haven't verified that yet.

The way I'm configuring this on x86_64-linux is:
mkdir ~/hbin
cat > ~/hbin/as <<\EOF2
#!/bin/sh
exec /usr/bin/as --32 "$@"
EOF2
cat > ~/hbin/g++ <<\EOF2
#!/bin/sh
exec /usr/bin/g++ -m32 "$@"
EOF2
cat > ~/hbin/gcc <<\EOF2
#!/bin/sh
exec /usr/bin/gcc -m32 "$@"
EOF2
cat > ~/hbin/ld <<\EOF2
#!/bin/sh
case "$*" in
  --version) cat <<\EOF
GNU ld version 2.20.52.0.1-10.fc17 20100131
Copyright 2012 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later
version.
This program has absolutely no warranty.
EOF
  exit 0;;
esac
exec /usr/bin/ld -m elf_i386 -L /usr/lib/ "$@"
EOF2
chmod 755 ~/hbin/*
PATH=~/hbin:$PATH i386 ../configure --enable-languages=all,obj-c++,lto,go --enable-checking=yes,rtl
PATH=~/hbin:$PATH i386 make -j48

	Jakub


* ICE with "[PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping"
  2013-10-30 23:53           ` Jakub Jelinek
@ 2013-10-31  7:02             ` Hans-Peter Nilsson
  2013-10-31 14:13             ` [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping Martin Jambor
  1 sibling, 0 replies; 10+ messages in thread
From: Hans-Peter Nilsson @ 2013-10-31  7:02 UTC (permalink / raw)
  To: mjambor; +Cc: stevenb.gcc, vmakarov, gcc-patches, jakub

> From: Jakub Jelinek <jakub@redhat.com>
> Date: Thu, 31 Oct 2013 00:16:41 +0100

> On Fri, Oct 25, 2013 at 05:19:06PM +0200, Martin Jambor wrote:
> > 2013-10-23  Martin Jambor  <mjambor@suse.cz>
> > 
> > 	PR rtl-optimization/10474
> > 	* ira.c (find_moveable_pseudos): Do not calculate dominance info
> > 	nor df analysis.
> > 	(interesting_dest_for_shprep): New function.
> > 	(split_live_ranges_for_shrink_wrap): Likewise.
> > 	(ira): Calculate dominance info and df analysis. Call
> > 	split_live_ranges_for_shrink_wrap.
> > 
> > testsuite/
> > 	* gcc.dg/pr10474.c: New testcase.
> > 	* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
> > 	* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
> 
> Unfortunately this patch breaks i686-linux bootstrap,

It (revision r204205) "also" causes
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58934>.

(If I had seen your message Jakub, it might have saved me the
reghunt and a PR.  On the other hand, debugging an ICE is far
more comfortable than a miscompare.)

brgds, H-P


* Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
  2013-10-30 23:53           ` Jakub Jelinek
  2013-10-31  7:02             ` ICE with "[PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping" Hans-Peter Nilsson
@ 2013-10-31 14:13             ` Martin Jambor
  1 sibling, 0 replies; 10+ messages in thread
From: Martin Jambor @ 2013-10-31 14:13 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Steven Bosscher, Vladimir Makarov, GCC Patches

On Thu, Oct 31, 2013 at 12:16:41AM +0100, Jakub Jelinek wrote:
> On Fri, Oct 25, 2013 at 05:19:06PM +0200, Martin Jambor wrote:
> > 2013-10-23  Martin Jambor  <mjambor@suse.cz>
> > 
> > 	PR rtl-optimization/10474
> > 	* ira.c (find_moveable_pseudos): Do not calculate dominance info
> > 	nor df analysis.
> > 	(interesting_dest_for_shprep): New function.
> > 	(split_live_ranges_for_shrink_wrap): Likewise.
> > 	(ira): Calculate dominance info and df analysis. Call
> > 	split_live_ranges_for_shrink_wrap.
> > 
> > testsuite/
> > 	* gcc.dg/pr10474.c: New testcase.
> > 	* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
> > 	* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
> 
> Unfortunately this patch breaks i686-linux bootstrap,

Because of this, PR 58934 and perhaps other problems, and because I
have reasons to doubt that I will be able to resolve them today or
tomorrow (for example seeing postreload in the backtraces makes me
think I'll need some help :-), I am about to commit the following to
revert my patch, after it passed C and C++ bootstrap on x86_64-linux.

Sorry for the breakage,

Martin

2013-10-31  Martin Jambor  <mjambor@suse.cz>

	PR rtl-optimization/58934

	Revert:
	2013-10-30  Martin Jambor  <mjambor@suse.cz>
	PR rtl-optimization/10474
	* ira.c (find_moveable_pseudos): Do not calculate dominance info
	nor df analysis.
	(interesting_dest_for_shprep): New function.
	(split_live_ranges_for_shrink_wrap): Likewise.
	(ira): Calculate dominance info and df analysis. Call
	split_live_ranges_for_shrink_wrap.

testsuite/
	* gcc.dg/pr10474.c: New testcase.
	* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
	* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.


diff --git a/gcc/ira.c b/gcc/ira.c
index d959109..1a26fed 100644
--- b/gcc/ira.c
+++ a/gcc/ira.c
@@ -3990,6 +3990,9 @@
   pseudo_replaced_reg.release ();
   pseudo_replaced_reg.safe_grow_cleared (max_regs);
 
+  df_analyze ();
+  calculate_dominance_info (CDI_DOMINATORS);
+
   i = 0;
   bitmap_initialize (&live, 0);
   bitmap_initialize (&used, 0);
@@ -4309,196 +4312,7 @@
   regstat_free_ri ();
   regstat_init_n_sets_and_refs ();
   regstat_compute_ri ();
-}
-
-
-/* If insn is interesting for parameter range-splitting shring-wrapping
-   preparation, i.e. it is a single set from a hard register to a pseudo, which
-   is live at CALL_DOM, return the destination.  Otherwise return NULL.  */
-
-static rtx
-interesting_dest_for_shprep (rtx insn, basic_block call_dom)
-{
-  rtx set = single_set (insn);
-  if (!set)
-    return NULL;
-  rtx src = SET_SRC (set);
-  rtx dest = SET_DEST (set);
-  if (!REG_P (src) || !HARD_REGISTER_P (src)
-      || !REG_P (dest) || HARD_REGISTER_P (dest)
-      || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))
-    return NULL;
-  return dest;
-}
-
-/* Split live ranges of pseudos that are loaded from hard registers in the
-   first BB in a BB that dominates all non-sibling call if such a BB can be
-   found and is not in a loop.  Return true if the function has made any
-   changes.  */
-
-static bool
-split_live_ranges_for_shrink_wrap (void)
-{
-  basic_block bb, call_dom = NULL;
-  basic_block first = single_succ (ENTRY_BLOCK_PTR);
-  rtx insn, last_interesting_insn = NULL;
-  bitmap_head need_new, reachable;
-  vec<basic_block> queue;
-
-  if (!flag_shrink_wrap)
-    return false;
-
-  bitmap_initialize (&need_new, 0);
-  bitmap_initialize (&reachable, 0);
-  queue.create (n_basic_blocks);
-
-  FOR_EACH_BB (bb)
-    FOR_BB_INSNS (bb, insn)
-      if (CALL_P (insn) && !SIBLING_CALL_P (insn))
-	{
-	  if (bb == first)
-	    {
-	      bitmap_clear (&need_new);
-	      bitmap_clear (&reachable);
-	      queue.release ();
-	      return false;
-	    }
-
-	  bitmap_set_bit (&need_new, bb->index);
-	  bitmap_set_bit (&reachable, bb->index);
-	  queue.quick_push (bb);
-	  break;
-	}
-
-  if (queue.is_empty ())
-    {
-      bitmap_clear (&need_new);
-      bitmap_clear (&reachable);
-      queue.release ();
-      return false;
-    }
-
-  while (!queue.is_empty ())
-    {
-      edge e;
-      edge_iterator ei;
-
-      bb = queue.pop ();
-      FOR_EACH_EDGE (e, ei, bb->succs)
-	if (e->dest != EXIT_BLOCK_PTR
-	    && bitmap_set_bit (&reachable, e->dest->index))
-	  queue.quick_push (e->dest);
-    }
-  queue.release ();
-
-  FOR_BB_INSNS (first, insn)
-    {
-      rtx dest = interesting_dest_for_shprep (insn, NULL);
-      if (!dest)
-	continue;
-
-      if (DF_REG_DEF_COUNT (REGNO (dest)) > 1)
-	{
-	  bitmap_clear (&need_new);
-	  bitmap_clear (&reachable);
-	  return false;
-	}
-
-      for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
-	   use;
-	   use = DF_REF_NEXT_REG (use))
-	{
-	  if (NONDEBUG_INSN_P (DF_REF_INSN (use))
-	      && GET_CODE (DF_REF_REG (use)) == SUBREG)
-	    {
-	      /* This is necessary to avoid hitting an assert at
-		 postreload.c:2294 in libstc++ testcases on x86_64-linux.  I'm
-		 not really sure what the probblem actually is there.  */
-	      bitmap_clear (&need_new);
-	      bitmap_clear (&reachable);
-	      return false;
-	    }
-
-	  int ubbi = DF_REF_BB (use)->index;
-	  if (bitmap_bit_p (&reachable, ubbi))
-	    bitmap_set_bit (&need_new, ubbi);
-	}
-      last_interesting_insn = insn;
-    }
-
-  bitmap_clear (&reachable);
-  if (!last_interesting_insn)
-    {
-      bitmap_clear (&need_new);
-      return false;
-    }
-
-  call_dom = nearest_common_dominator_for_set (CDI_DOMINATORS, &need_new);
-  bitmap_clear (&need_new);
-  if (call_dom == first)
-    return false;
-
-  loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
-  while (bb_loop_depth (call_dom) > 0)
-    call_dom = get_immediate_dominator (CDI_DOMINATORS, call_dom);
-  loop_optimizer_finalize ();
-
-  if (call_dom == first)
-    return false;
-
-  calculate_dominance_info (CDI_POST_DOMINATORS);
-  if (dominated_by_p (CDI_POST_DOMINATORS, first, call_dom))
-    {
-      free_dominance_info (CDI_POST_DOMINATORS);
-      return false;
-    }
-  free_dominance_info (CDI_POST_DOMINATORS);
-
-  if (dump_file)
-    fprintf (dump_file, "Will split live ranges of parameters at BB %i\n",
-	     call_dom->index);
-
-  bool ret = false;
-  FOR_BB_INSNS (first, insn)
-    {
-      rtx dest = interesting_dest_for_shprep (insn, call_dom);
-      if (!dest)
-	continue;
-
-      rtx newreg = NULL_RTX;
-      df_ref use, next;
-      for (use = DF_REG_USE_CHAIN (REGNO(dest)); use; use = next)
-	{
-	  rtx uin = DF_REF_INSN (use);
-	  next = DF_REF_NEXT_REG (use);
-
-	  basic_block ubb = BLOCK_FOR_INSN (uin);
-	  if (ubb == call_dom
-	      || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
-	    {
-	      if (!newreg)
-		newreg = ira_create_new_reg (dest);
-	      validate_change (uin, DF_REF_LOC (use), newreg, true);
-	    }
-	}
-
-      if (newreg)
-	{
-	  rtx new_move = gen_move_insn (newreg, dest);
-	  emit_insn_after (new_move, bb_note (call_dom));
-	  if (dump_file)
-	    {
-	      fprintf (dump_file, "Split live-range of register ");
-	      print_rtl_single (dump_file, dest);
-	    }
-	  ret = true;
-	}
-
-      if (insn == last_interesting_insn)
-	break;
-    }
-  apply_change_group ();
-  return ret;
+  free_dominance_info (CDI_DOMINATORS);
 }
 
 /* Perform the second half of the transformation started in
@@ -4709,16 +4523,7 @@
      allocation because of -O0 usage or because the function is too
      big.  */
   if (ira_conflicts_p)
-    {
-      df_analyze ();
-      calculate_dominance_info (CDI_DOMINATORS);
-
-      find_moveable_pseudos ();
-      if (split_live_ranges_for_shrink_wrap ())
-	df_analyze ();
-
-      free_dominance_info (CDI_DOMINATORS);
-    }
+    find_moveable_pseudos ();
 
   max_regno_before_ira = max_reg_num ();
   ira_setup_eliminable_regset (true);
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
new file mode 100644
index 0000000..fe497c2
--- b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
+++ /dev/null
@@ -1,31 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
-
-int __attribute__((noinline, noclone))
-foo (int a)
-{
-  return a + 5;
-}
-
-static int g;
-
-int __attribute__((noinline, noclone))
-bar (int a)
-{
-  int r;
-
-  if (a)
-    {
-      r = foo (a);
-      g = r + a;
-    }
-  else
-    r = a+1;
-  return r;
-}
-
-/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } } */
-/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
-/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
-/* { dg-final { cleanup-rtl-dump "ira" } } */
-/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
new file mode 100644
index 0000000..872a757
--- b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
+++ /dev/null
@@ -1,36 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
-
-int __attribute__((noinline, noclone))
-foo (int a)
-{
-  return a + 5;
-}
-
-static int g;
-
-int __attribute__((noinline, noclone))
-bar (int a)
-{
-  int r;
-
-  if (a)
-    {
-      r = a;
-      while (r < 500)
-	if (r % 2)
-	  r = foo (r);
-	else
-	  r = foo (r+1);
-      g = r + a;
-    }
-  else
-    r = g+1;
-  return r;
-}
-
-/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } } */
-/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
-/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
-/* { dg-final { cleanup-rtl-dump "ira" } } */
-/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.dg/pr10474.c b/gcc/testsuite/gcc.dg/pr10474.c
new file mode 100644
index 0000000..ee085c3
--- b/gcc/testsuite/gcc.dg/pr10474.c
+++ /dev/null
@@ -1,16 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O3 -fdump-rtl-pro_and_epilogue"  } */
-
-void f(int *i)
-{
-	if (!i)
-		return;
-	else
-	{
-		__builtin_printf("Hi");
-		*i=0;
-	}
-}
-
-/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  } } */
-/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */


end of thread, other threads:[~2013-10-31 13:34 UTC | newest]

Thread overview: 10+ messages
2013-10-21 16:46 [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping Martin Jambor
2013-10-22  0:03 ` Steven Bosscher
2013-10-22  6:52   ` Vladimir Makarov
2013-10-23 17:20     ` Martin Jambor
2013-10-23 23:49       ` Steven Bosscher
2013-10-25 15:49         ` Martin Jambor
2013-10-29 16:26           ` Vladimir Makarov
2013-10-30 23:53           ` Jakub Jelinek
2013-10-31  7:02             ` ICE with "[PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping" Hans-Peter Nilsson
2013-10-31 14:13             ` [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping Martin Jambor
