Interest in integer auto-upcasting pass for normalization and optimization?

public inbox for gcc@gcc.gnu.org

* Interest in integer auto-upcasting pass for normalization and optimization?
@ 2009-05-09 20:20 Albert Cohen
  2009-05-09 22:28 ` Richard Guenther
  2009-05-11 10:18 ` Daniel Jacobowitz
  0 siblings, 2 replies; 8+ messages in thread
From: Albert Cohen @ 2009-05-09 20:20 UTC
  To: GCC; +Cc: Albert Cohen

Sebastian Pop and I have been discussing the option of designing a new
pass, based on vrp, to normalize integer types towards a canonical
supertype (typically a machine word, equivalent to signed long), or to
truncate them to a smaller-size word when it makes sense. This would be
a very simple pass (on top of not-so-simple vrp), but arguably quite a
regression-prone one as well (due to aliases/escape and common C
standard violations).

The pass could be parameterized with three different objectives, 
depending on where it is scheduled in the pass manager.

(1) canonicalize to the supertype aggressively, to facilitate the 
application of further passes like autovect which require very precise 
understanding of the type conversions;
(2) compress the types to increase vectorization factor and reduce 
register pressure (assuming the target supports sub-word register 
allocation with register aliases);
(3) optimize the types to minimize the dynamic number of casts that 
result in actual ASM instructions.

Graphite and the vectorizer would clearly benefit from such a pass, at 
least if it implemented objective (1).
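
A small C illustration of objectives (1) and (2) may help here. It is a
hand-written sketch of the before/after shapes, not the output of any
existing pass; the function names are made up for the example, and the
range facts stated in the comments are the kind of information vrp would
have to prove.

```c
#include <stdint.h>

/* Before: the induction variable is 8 bits wide.  Since i < n <= 255
   inside the loop, i is at most 254 there, so i++ never wraps.  */
long sum_narrow(const int32_t *a, uint8_t n)
{
    long sum = 0;
    for (uint8_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Objective (1), written by hand: the induction variable and the bound
   are widened to signed long.  This is only valid because the range
   argument above rules out wrap-around in the original loop.  */
long sum_canonical(const int32_t *a, uint8_t n)
{
    long sum = 0;
    long bound = n;                 /* one widening, outside the loop */
    for (long i = 0; i < bound; i++)
        sum += a[i];
    return sum;
}

/* Objective (2), the opposite direction.  C evaluates the addition in
   int, but both operands are in [0, 15], so the result is in [0, 30]
   and an 8-bit temporary is sufficient.  */
uint8_t add_nibbles_wide(uint8_t x, uint8_t y)
{
    int t = (x & 0x0f) + (y & 0x0f);
    return (uint8_t) t;
}

uint8_t add_nibbles_narrow(uint8_t x, uint8_t y)
{
    uint8_t t = (uint8_t) ((x & 0x0f) + (y & 0x0f));
    return t;
}
```

Each pair computes the same result for every input; the difference is
only in the types a later pass such as the vectorizer gets to see.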

I wonder if some of this is already implemented somewhere, or if someone 
played with it in the past, or is interested in contributing.

Nothing is planned yet on our side, and temporary fixes exist in the 
short term (as far as Graphite and the vectorizer are concerned), but it 
would potentially be of great help.

Feedback welcome,
Albert

* Re: Interest in integer auto-upcasting pass for normalization and   optimization?
  2009-05-09 20:20 Interest in integer auto-upcasting pass for normalization and optimization? Albert Cohen
@ 2009-05-09 22:28 ` Richard Guenther
  2009-05-10  1:39   ` Richard Guenther
  2009-05-11 10:18 ` Daniel Jacobowitz
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Guenther @ 2009-05-09 22:28 UTC
  To: Albert Cohen; +Cc: GCC

On Sat, May 9, 2009 at 10:07 PM, Albert Cohen <Albert.Cohen@inria.fr> wrote:
> Sebastian Pop and I have been discussing the option of designing a new pass,
> based on vrp, to normalize integer types towards a canonical supertype
> typically a machine word, equivalent to signed long, or to truncate to a
> smaller-size word when it makes sense. This would be a very simple pass (on
> top of not-so-simple vrp), but arguably a quite regression-prone one as well
> (due to aliases/escape and common C standard violations).
>
> The pass could be parameterized with three different objectives, depending
> on where it is scheduled in the pass manager.
>
> (1) canonicalize to the supertype aggressively, to facilitate the
> application of further passes like autovect which require very precise
> understanding of the type conversions;
> (2) compress the types to increase vectorization factor and reduce register
> pressure (assuming the target supports sub-word register allocation with
> register aliases);
> (3) optimize the types to minimize the dynamic number of casts that result
> in actual ASM instructions.
>
> Graphite and the vectorizer would clearly benefit from such a pass, at least
> if it implemented objective (1).
>
> I wonder if some of this is already implemented somewhere, or if someone
> played with it in the past, or is interested in contributing.
>
> Nothing is planned yet on our side, and temporary fixes exist in the short
> term (as far as Graphite and the vectorizer are concerned), but it would
> potentially be of great help.

This is certainly one useful transformation based on value-range information.
The choice of a canonical type is of course at least target dependent.

I suppose you want to do this on register variables only?  Did you think about
promoting function arguments and returns as well as part of an IPA pass?

I don't understand how register variable promotion/demotion will help graphite
though - I had the impression graphite can only work on memory.  No?

Thanks,
Richard.

* Re: Interest in integer auto-upcasting pass for normalization and   optimization?
  2009-05-09 22:28 ` Richard Guenther
@ 2009-05-10  1:39   ` Richard Guenther
  2009-05-11  9:21     ` Albert Cohen
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Guenther @ 2009-05-10  1:39 UTC
  To: Albert Cohen; +Cc: GCC

On Sat, May 9, 2009 at 10:42 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Sat, May 9, 2009 at 10:07 PM, Albert Cohen <Albert.Cohen@inria.fr> wrote:
>> Sebastian Pop and I have been discussing the option of designing a new pass,
>> based on vrp, to normalize integer types towards a canonical supertype
>> typically a machine word, equivalent to signed long, or to truncate to a
>> smaller-size word when it makes sense. This would be a very simple pass (on
>> top of not-so-simple vrp), but arguably a quite regression-prone one as well
>> (due to aliases/escape and common C standard violations).
>>
>> The pass could be parameterized with three different objectives, depending
>> on where it is scheduled in the pass manager.
>>
>> (1) canonicalize to the supertype aggressively, to facilitate the
>> application of further passes like autovect which require very precise
>> understanding of the type conversions;
>> (2) compress the types to increase vectorization factor and reduce register
>> pressure (assuming the target supports sub-word register allocation with
>> register aliases);
>> (3) optimize the types to minimize the dynamic number of casts that result
>> in actual ASM instructions.
>>
>> Graphite and the vectorizer would clearly benefit from such a pass, at least
>> if it implemented objective (1).
>>
>> I wonder if some of this is already implemented somewhere, or if someone
>> played with it in the past, or is interested in contributing.
>>
>> Nothing is planned yet on our side, and temporary fixes exist in the short
>> term (as far as Graphite and the vectorizer are concerned), but it would
>> potentially be of great help.
>
> This is certainly one useful transformation based on value-range information.
> The choice of a canonical type is of course at least target dependent.

This, by the way, could at least partly replace the SEE (or the missing
ZEE) pass.

> I suppose you want to do this on register variables only?  Did you think about
> promoting function arguments and returns as well as part of an IPA pass?
>
> I don't understand how register variable promotion/demotion will help graphite
> though - I had the impression graphite can only work on memory.  No?
>
> Thanks,
> Richard.
>

* Re: Interest in integer auto-upcasting pass for normalization and   optimization?
  2009-05-10  1:39   ` Richard Guenther
@ 2009-05-11  9:21     ` Albert Cohen
  0 siblings, 0 replies; 8+ messages in thread
From: Albert Cohen @ 2009-05-11  9:21 UTC
  To: Richard Guenther; +Cc: GCC

Richard Guenther wrote:
> On Sat, May 9, 2009 at 10:42 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Sat, May 9, 2009 at 10:07 PM, Albert Cohen <Albert.Cohen@inria.fr> wrote:
>>> Sebastian Pop and I have been discussing the option of designing a new pass,
>>> based on vrp, to normalize integer types towards a canonical supertype
>>> typically a machine word, equivalent to signed long, or to truncate to a
>>> smaller-size word when it makes sense. This would be a very simple pass (on
>>> top of not-so-simple vrp), but arguably a quite regression-prone one as well
>>> (due to aliases/escape and common C standard violations).
>>>
>>> The pass could be parameterized with three different objectives, depending
>>> on where it is scheduled in the pass manager.
>>>
>>> (1) canonicalize to the supertype aggressively, to facilitate the
>>> application of further passes like autovect which require very precise
>>> understanding of the type conversions;
>>> (2) compress the types to increase vectorization factor and reduce register
>>> pressure (assuming the target supports sub-word register allocation with
>>> register aliases);
>>> (3) optimize the types to minimize the dynamic number of casts that result
>>> in actual ASM instructions.
>>>
>>> Graphite and the vectorizer would clearly benefit from such a pass, at least
>>> if it implemented objective (1).
>>>
>>> I wonder if some of this is already implemented somewhere, or if someone
>>> played with it in the past, or is interested in contributing.
>>>
>>> Nothing is planned yet on our side, and temporary fixes exist in the short
>>> term (as far as Graphite and the vectorizer are concerned), but it would
>>> potentially be of great help.
>> This is certainly one useful transformation based on value-range information.
>> The choice of a canonical type is of course at least target dependent.
> 
> This btw. can at least partly replace the SEE (or the missed ZEE) pass.

We'll start by looking at what is done there, thanks. Not sure when we 
will start, though...

>> I suppose you want to do this on register variables only?  Did you think about
>> promoting function arguments and returns as well as part of an IPA pass?
>>
>> I don't understand how register variable promotion/demotion will help graphite
>> though - I had the impression graphite can only work on memory.  No?

It will work at a higher level: graphite generates new loops with brand 
new induction variables, whose types are canonical (cf. your earlier 
canonical type suggestion). No way those canonical types can match 
preexisting induction variables in general. This causes a lot of casts 
in some cases, confuses the vectorizer, and may even generate nasty
SE/ZE instructions eventually.
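
To make the mismatch concrete, here is a hand-written C sketch. It is
not actual Graphite output: the function names and the regenerated loop
shape are assumptions chosen to show where the casts come from when a
new canonical induction variable meets code that still uses the old
type.

```c
/* Original loop: the induction variable has the source type int.  */
void scale_orig(float *a, int n)
{
    for (int i = 0; i < n; i++)
        a[i] *= 2.0f;
}

/* Hypothetical regenerated loop: the new induction variable t has the
   canonical type long, but the body still refers to an int i.  */
void scale_regen(float *a, int n)
{
    for (long t = 0; t < (long) n; t++) {
        int i = (int) t;        /* truncate to the pre-existing type */
        a[(long) i] *= 2.0f;    /* ... and extend again for addressing */
    }
}

/* What a normalization pass would aim for: canonical types throughout,
   with the single widening of n hoisted out of the loop.  */
void scale_canonical(float *a, int n)
{
    long bound = n;
    for (long t = 0; t < bound; t++)
        a[t] *= 2.0f;
}
```

All three functions compute the same result; only the second carries a
truncate/extend pair on every iteration.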

Thank you,
Albert

* Re: Interest in integer auto-upcasting pass for normalization and  optimization?
  2009-05-09 20:20 Interest in integer auto-upcasting pass for normalization and optimization? Albert Cohen
  2009-05-09 22:28 ` Richard Guenther
@ 2009-05-11 10:18 ` Daniel Jacobowitz
  1 sibling, 0 replies; 8+ messages in thread
From: Daniel Jacobowitz @ 2009-05-11 10:18 UTC
  To: Albert Cohen; +Cc: GCC

On Sat, May 09, 2009 at 10:07:38PM +0200, Albert Cohen wrote:
> I wonder if some of this is already implemented somewhere, or if someone  
> played with it in the past, or is interested in contributing.

This sounds a bit similar to Nathan Froyd's recent work:

http://gcc.gnu.org/ml/gcc-patches/2009-04/msg01860.html

-- 
Daniel Jacobowitz
CodeSourcery

* Re: Interest in integer auto-upcasting pass for normalization and  optimization?
  2009-05-15 16:55 ` Bernd Schmidt
@ 2009-05-15 18:05   ` Joern Rennecke
  0 siblings, 0 replies; 8+ messages in thread
From: Joern Rennecke @ 2009-05-15 18:05 UTC
  To: Bernd Schmidt; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 712 bytes --]

Quoting Bernd Schmidt <bernds_cb1@t-online.de>:
>> 2009-02-24  J"orn Rennecke  <joern.rennecke@arc.com>
>>
>>         * loop-doloop.c (doloop_valid_p): Rename to:
>>         (validize_doloop).  Try to fix up loops with conditions for infinite
>>         looping by enclosing them in an outer loop.
>>         Changed caller.
>>         (add_test): Add new parameter edgep.  Changed caller.
>
> This looks interesting, but unless I read the svn log incorrectly, the
> arc branches have all patches applied in a single commit, which isn't
> exactly useful.  Do you have them broken-out somewhere?

I've appended the sliced & diced diff - not separately tested, though.
Please let me know if there are any problems.

[-- Attachment #2: validize-doloop-patch --]
[-- Type: text/plain, Size: 7703 bytes --]

2009-02-24  J"orn Rennecke  <joern.rennecke@arc.com>

	* loop-doloop.c (doloop_valid_p): Rename to:
	(validize_doloop).  Try to fix up loops with conditions for infinite
	looping by enclosing them in an outer loop.
	Changed caller.
	(add_test): Add new parameter edgep.  Changed caller.

Index: loop-doloop.c
===================================================================
--- loop-doloop.c	(revision 145497)
+++ loop-doloop.c	(revision 145498)
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  
 #include "output.h"
 #include "params.h"
 #include "target.h"
+#include "optabs.h"
 
 /* This module is used to modify loops with a determinable number of
    iterations to use special low-overhead looping instructions.
@@ -194,22 +195,92 @@ doloop_condition_get (rtx doloop_pat)
   return 0;
 }
 
-/* Return nonzero if the loop specified by LOOP is suitable for
-   the use of special low-overhead looping instructions.  DESC
-   describes the number of iterations of the loop.  */
+static bool add_test (rtx cond, edge *e, basic_block dest, edge *);
 
-static bool
-doloop_valid_p (struct loop *loop, struct niter_desc *desc)
+/* Check if the loop specified by LOOP is suitable for
+   the use of special low-overhead looping instructions.
+   If necessary to properly implement infinite loops, this may cause
+   a new enclosing loop to be formed.  Returns the (possibly changed)
+   loop structure pointer on success, else NULL.
+   DESC describes the number of iterations of the loop.  */
+
+static struct loop *
+validize_doloop (struct loop *loop, struct niter_desc *desc)
 {
   basic_block *body = get_loop_body (loop), bb;
   rtx insn;
   unsigned i;
   bool result = true;
+  rtx list;
+  edge out_edge;
 
   /* Check for loops that may not terminate under special conditions.  */
   if (!desc->simple_p
       || desc->assumptions
-      || desc->infinite)
+      || (desc->infinite
+	  && (EDGE_COUNT (loop->latch->preds) != 1
+	      || !optimize_loop_for_speed_p (loop))))
+    result = false;
+  if (desc->infinite)
+    {
+      edge e, latch_in;
+      edge_iterator ei;
+      rtx insn;
+
+      /* We want to set out_edge to the edge that is used to exit the loop
+	 if the loop count is exhausted.  For now, only handle the case
+	 of a single exit.  */
+      out_edge = NULL;
+      if (single_pred_p (loop->latch))
+	{
+	  latch_in = single_pred_edge (loop->latch);
+	  FOR_EACH_EDGE (e, ei, latch_in->src->succs)
+	    if (e == latch_in)
+	      ; /* do nothing */
+	    else if (!out_edge)
+	      out_edge = e;
+	    else
+	      result = false;
+	}
+      if (!out_edge)
+	result = false;
+      else if (dump_file)
+	fprintf (dump_file, "Doloop: considering putting infinite loop"
+		 " instructions on edge from %d to %d.\n",
+		 out_edge->src->index, out_edge->dest->index);
+      /* The (non-jump) instructions in the current loop latch should be
+	 copied into the new loop latch cf. gcc.c-torture/execute/pr27285.c .
+	 For now, just punt when we see any insns in the latch.  */
+      FOR_BB_INSNS (loop->latch, insn)
+	if (NONJUMP_INSN_P (insn))
+	  {
+	    result = false;
+	    break;
+	  }
+	 
+    }
+  /* check_simple_exit can create conditions that do_compare_and_jump_rtx
+     can't grok.  */
+  for (list = desc->infinite; list; list = XEXP (list, 1))
+    {
+      rtx cond = XEXP (list, 0);
+      enum machine_mode mode;
+
+      if (!BINARY_P (cond))
+	{
+	  result = false;
+	  break;
+	}
+      mode = GET_MODE (XEXP (cond, 0));
+      if (mode == VOIDmode)
+	mode = GET_MODE (XEXP (cond, 1));
+
+      if (GET_MODE_CLASS (mode) == MODE_INT
+	  && !can_compare_p (GET_CODE (cond), mode, ccp_jump)
+	  && !COMPARISON_P (cond))
+	result = false;
+    }
+  if (!result)
     {
       /* There are some cases that would require a special attention.
 	 For example if the comparison is LEU and the comparison value
@@ -261,27 +332,84 @@ doloop_valid_p (struct loop *loop, struc
 	    }
 	}
     }
-  result = true;
+  if (desc->infinite)
+    {
+      basic_block header = loop->header;
+      basic_block latch;
+      struct loop *new_loop;
+
+      gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS));
+	{
+	  edge latch_edge = single_succ_edge (loop->latch);
+	  edge in_edge;
+
+	  gcc_assert (EDGE_COUNT (header->preds) == 2);
+	  gcc_assert (latch_edge->dest == header);
+	  if (dump_file)
+	    fprintf (dump_file,
+		     "Doloop: infinite loop generation: latch %d header %d\n",
+		      loop->latch->index, header->index);
+	  in_edge = EDGE_PRED (header, 0);
+	  if (in_edge == latch_edge)
+	    in_edge = EDGE_PRED (header, 1);
+	  else
+	    gcc_assert (latch_edge == EDGE_PRED (header, 1));
+	  gcc_assert (in_edge != out_edge);
+	  header = split_edge (in_edge);
+	  set_immediate_dominator (CDI_DOMINATORS, loop->header, header);
+	  remove_bb_from_loops (header);
+	  add_bb_to_loop (header, loop);
+#if 0 /* For debugging, insert a marker insn.  */
+	  emit_insn_after (gen_unimp_s (GEN_INT (1)), BB_END (header));
+#endif
+	}
+      for (latch = header, list = desc->infinite; list; list = XEXP (list, 1))
+	{
+	  edge new_latch_edge = out_edge;
+
+	  add_test (XEXP (list, 0), &out_edge, latch, &new_latch_edge);
+	  remove_bb_from_loops (out_edge->src);
+	  add_bb_to_loop (out_edge->src, loop);
+	  if (latch == header)
+	    {
+	      latch = split_edge (new_latch_edge);
+	      remove_bb_from_loops (latch);
+	      add_bb_to_loop (latch, loop);
+#if 0 /* For debugging, insert a marker insn.  */
+	      emit_insn_after (gen_trap_s (GEN_INT (42)), BB_END (latch));
+#endif
+	    }
+	}
+      new_loop = alloc_loop ();
+      new_loop->header = loop->header;
+      new_loop->latch = loop->latch;
+      loop->header = header;
+      loop->latch = latch;
+      add_loop (new_loop, loop);
+      loop = new_loop;
+    }
 
 cleanup:
   free (body);
 
-  return result;
+  return result ? loop : 0;
 }
 
 /* Adds test of COND jumping to DEST on edge *E and set *E to the new fallthru
    edge.  If the condition is always false, do not do anything.  If it is always
    true, redirect E to DEST and return false.  In all other cases, true is
-   returned.  */
+   returned.
+   If EDGEP is non-null, assign any newly created edge to it.  */
 
 static bool
-add_test (rtx cond, edge *e, basic_block dest)
+add_test (rtx cond, edge *e, basic_block dest, edge *edgep)
 {
   rtx seq, jump, label;
   enum machine_mode mode;
   rtx op0 = XEXP (cond, 0), op1 = XEXP (cond, 1);
   enum rtx_code code = GET_CODE (cond);
   basic_block bb;
+  edge new_edge;
 
   mode = GET_MODE (XEXP (cond, 0));
   if (mode == VOIDmode)
@@ -325,7 +453,9 @@ add_test (rtx cond, edge *e, basic_block
 
   LABEL_NUSES (label)++;
 
-  make_edge (bb, dest, (*e)->flags & ~EDGE_FALLTHRU);
+  new_edge = make_edge (bb, dest, (*e)->flags & ~EDGE_FALLTHRU);
+  if (edgep)
+    *edgep = new_edge;
   return true;
 }
 
@@ -448,7 +578,7 @@ doloop_modify (struct loop *loop, struct
 
       te = single_succ_edge (preheader);
       for (; ass; ass = XEXP (ass, 1))
-	if (!add_test (XEXP (ass, 0), &te, set_zero))
+	if (!add_test (XEXP (ass, 0), &te, set_zero, NULL))
 	  break;
 
       if (ass)
@@ -565,7 +697,8 @@ doloop_optimize (struct loop *loop)
   desc = get_simple_loop_desc (loop);
 
   /* Check that loop is a candidate for a low-overhead looping insn.  */
-  if (!doloop_valid_p (loop, desc))
+  loop = validize_doloop (loop, desc);
+  if (!loop)
     {
       if (dump_file)
 	fprintf (dump_file,

* Re: Interest in integer auto-upcasting pass for normalization and   optimization?
  2009-05-11 12:05 Joern Rennecke
@ 2009-05-15 16:55 ` Bernd Schmidt
  2009-05-15 18:05   ` Joern Rennecke
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Schmidt @ 2009-05-15 16:55 UTC
  To: Joern Rennecke; +Cc: gcc

Joern Rennecke wrote:
> For ARCompact, doloop loops in particular were interesting to
> optimize, which is why I have a patch to loop-doloop.c in
> svn://gcc.gnu.org/svn/gcc/branches/arc-4_4-branch :
> 
> 2009-02-24  J"orn Rennecke  <joern.rennecke@arc.com>
> 
>          * loop-doloop.c (doloop_valid_p): Rename to:
>          (validize_doloop).  Try to fix up loops with conditions for infinite
>          looping by enclosing them in an outer loop.
>          Changed caller.
>          (add_test): Add new parameter edgep.  Changed caller.

This looks interesting, but unless I read the svn log incorrectly, the 
arc branches have all patches applied in a single commit, which isn't 
exactly useful.  Do you have them broken-out somewhere?


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

* Re: Interest in integer auto-upcasting pass for normalization and  optimization?
@ 2009-05-11 12:05 Joern Rennecke
  2009-05-15 16:55 ` Bernd Schmidt
  0 siblings, 1 reply; 8+ messages in thread
From: Joern Rennecke @ 2009-05-11 12:05 UTC
  To: gcc

----- Forwarded message from amylaar@spamcop.net -----
     Date: Mon, 11 May 2009 05:07:52 -0400
     From: Joern Rennecke <amylaar@spamcop.net>
Reply-To: Joern Rennecke <amylaar@spamcop.net>
  Subject: Re: Interest in integer auto-upcasting pass for
           normalization and optimization?
       To: Daniel Jacobowitz <drow@false.org>
       Cc: Albert Cohen <Albert.Cohen@inria.fr>,
           Richard Guenther <richard.guenther@gmail.com>,
           Nathan Froyd <froydnj@codesourcery.com>

> On Sat, May 09, 2009 at 10:07:38PM +0200, Albert Cohen wrote:
>> I wonder if some of this is already implemented somewhere, or if
>> someone played with it in the past, or is interested in contributing.

Daniel Jacobowitz:
> This sounds a bit similar to Nathan Froyd's recent work:

Note that this is PR 29944.

For ARCompact, doloop loops in particular were interesting to
optimize, which is why I have a patch to loop-doloop.c in
svn://gcc.gnu.org/svn/gcc/branches/arc-4_4-branch :

2009-02-24  J"orn Rennecke  <joern.rennecke@arc.com>

          * loop-doloop.c (doloop_valid_p): Rename to:
          (validize_doloop).  Try to fix up loops with conditions for infinite
          looping by enclosing them in an outer loop.
          Changed caller.
          (add_test): Add new parameter edgep.  Changed caller.
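
As a rough source-level picture of what "enclosing them in an outer
loop" means, consider the C sketch below. It is only an illustration
under stated assumptions: the real transformation works on RTL and the
CFG, the function names are invented, and where exactly the counter is
re-initialised is a guess made for the sake of the example.

```c
#include <stdint.h>

/* Original loop.  It runs n + 1 times, except that for n == UINT8_MAX
   the test i <= n is always true and the loop never terminates.  */
unsigned walk_ref(uint8_t n)
{
    unsigned steps = 0;
    for (uint8_t i = 0; i <= n; i++)
        steps++;
    return steps;
}

/* Sketch of the shape after the fix-up.  The inner loop is a plain
   counted loop of the kind a doloop instruction can implement.  The
   outer loop re-enters it for as long as the condition under which the
   original loop is infinite holds.  */
unsigned walk_doloop(uint8_t n)
{
    unsigned steps = 0;
    do {
        unsigned count = (unsigned) n + 1;  /* always at least 1 */
        do {
            steps++;
        } while (--count != 0);
    } while (n == UINT8_MAX);               /* "infinite" condition */
    return steps;
}
```

For every n below UINT8_MAX both functions return n + 1; for
n == UINT8_MAX neither returns, which is the behaviour the outer loop is
there to preserve.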






----- End forwarded message -----

Thread overview: 8+ messages
2009-05-09 20:20 Interest in integer auto-upcasting pass for normalization and optimization? Albert Cohen
2009-05-09 22:28 ` Richard Guenther
2009-05-10  1:39   ` Richard Guenther
2009-05-11  9:21     ` Albert Cohen
2009-05-11 10:18 ` Daniel Jacobowitz
2009-05-11 12:05 Joern Rennecke
2009-05-15 16:55 ` Bernd Schmidt
2009-05-15 18:05   ` Joern Rennecke
