public inbox for gcc-patches@gcc.gnu.org
* [trans-mem] rewrite transaction lowering
@ 2008-10-18  2:02 Richard Henderson
  2008-10-24  0:12 ` Albert Cohen
From: Richard Henderson @ 2008-10-18  2:02 UTC
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 949 bytes --]

The lowering of transactions is now much more to my liking.  It's a 
5-step process, described in detail in a block comment at the beginning 
of gtm-low.c.  The important functional changes are (1) we properly 
commit the transaction when we leave the region via an exception, and 
(2) we accurately describe the CFG and the abnormal edges created by 
the longjmp used to restart or abort the transaction.
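
For concreteness, the source-level case that (1) is about is simply
this (a minimal sketch; may_throw and global are placeholders):

	__tm_atomic {
	  global++;
	  may_throw ();	/* if this throws, the commit inserted in the
			   EH cleanup still runs as the exception
			   propagates out of the region */
	}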

Still to do: rtl expansion of the TM_LOAD/STORE nodes, rewriting 
creation of the transactional clones as a real IPA pass, and 
duplicating the transaction region without annotations as a fast path 
when the transaction gets (re-)started in sequential mode.
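
After checkpointing, that fast path would look something like this
(sketch only; the mode bit and its name are made up):

	x = __builtin___tm_start (MAY_ABORT);
	if (x & run_sequential)
	  {
	    /* uninstrumented copy: plain loads and stores */
	  }
	else
	  {
	    /* instrumented copy: TM_LOAD/TM_STORE barriers */
	  }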

What I generate is no longer compatible with TinySTM.  It's closer to 
what I believe the final ABI should look like.  At some point I'll 
either import some existing GPL STM library (TinySTM, unless someone has 
a better suggestion) or start a new one from scratch.  We'll see...


r~

[-- Attachment #2: z.txt --]
[-- Type: text/plain, Size: 53263 bytes --]

        * except.c (struct eh_region): Add ERT_TRANSACTION.
        (gen_eh_region): Allow if flag_tm.
        (gen_eh_region_transaction, get_eh_region_from_number): New.
        (remove_eh_handler): Export.
        (current_function_has_exception_handlers): Handle ERT_TRANSACTION.
        (build_post_landing_pads, reachable_next_level): Likewise.
        (collect_one_action_chain): Likewise.
        (foreach_reachable_transaction): New.
        * except.h: Add new exported decls.
        * gimple-low.c (struct lower_data): Remove in_transaction.
        (lower_tm_atomic, record_vars_into_tm): Remove.
        * gimple-pretty-print.c (dump_gimple_fmt): Add %x.
        (dump_gimple_assign): Handle TM_LOAD/STORE.
        (dump_gimple_tm_atomic): Dump the subcode.
        * gimple.h (GTMA_HAVE_ABORT, GTMA_HAVE_LOAD, GTMA_HAVE_STORE,
        GTMA_HAVE_CALL_TM, GTMA_HAVE_CALL_IRREVOKABLE,
        GTMA_MUST_CALL_IRREVOKABLE, GTMA_HAVE_CALL_INDIRECT): New.
        (gimple_tm_atomic_subcode, gimple_tm_atomic_set_subcode): New.
        * gtm-low.c (struct ltm_state, add_stmt_to_transaction,
        lower_assign_tm, lower_call_tm, remove_tm_commits,
        lower_tm_atomic, lower_sequence_tm, lower_sequence_no_tm): New.
        (execute_lower_tm): Use them.
        (TM_START_RESTORE_LIVE_IN, TM_START_ABORT): New.
        (checkpoint_live_in_variables): Rewrite.
        (checkpoint_tm_txn, checkpoint_tm): Remove.
        (expand_tm_atomic): New.
        (execute_checkpoint_tm): Use it.
        (make_tm_edge_1, make_tm_edge, is_transactional_stmt): New.
        (pass_lower_tm): Rename from pass_expand_tm.
        * passes.c (init_optimization_passes): Run pass_lower_tm
        immediately after pass_lower_eh.  Run pass_checkpoint_tm
        after early optimizations.
        * tree-cfg.c (make_edges): Call make_tm_edge.  Conditionally
        create the __tm_atomic abort edge. 
        (cleanup_dead_labels): Handle GIMPLE_TM_ATOMIC.  Avoid unnecessary
        writes into the statements to update labels.
        (is_ctrl_altering_stmt): Include is_transactional_stmt.
        (verify_stmt): Handle transactional edges.
        * tree-eh.c (collect_finally_tree): Walk GIMPLE_TM_ATOMIC.
        (lower_eh_constructs_2): Create EH regions for them.
        (verify_eh_edges): Handle transactional edges.
        * tree-flow.h (make_tm_edge, is_transactional_stmt): Declare.

        * c-parser.c (c_parser_tm_abort): Call add_stmt.
        * cgraphbuild.c (prepare_tm_clone): Disable for now.

--- c-parser.c	(revision 141162)
+++ c-parser.c	(local)
@@ -8255,7 +8255,7 @@ c_parser_tm_abort (c_parser *parser)
   /* ??? Verify that __tm_abort is contained within the
      lexical scope of a __tm_atomic.  */
 
-  return build_call_expr (built_in_decls[BUILT_IN_TM_ABORT], 0);
+  return add_stmt (build_call_expr (built_in_decls[BUILT_IN_TM_ABORT], 0));
 }
 \f
 /* Parse a single source file.  */
--- cgraphbuild.c	(revision 141162)
+++ cgraphbuild.c	(local)
@@ -125,7 +125,7 @@ compute_call_stmt_bb_frequency (basic_bl
 /* Eagerly clone functions so that TM expansion can create
    and redirect calls to a transactional clone.  */
 
-static void
+static void ATTRIBUTE_UNUSED
 prepare_tm_clone (struct cgraph_node *node)
 {
   struct cgraph_node *tm_node;
@@ -275,7 +275,7 @@ build_cgraph_edges (void)
 
   build_cgraph_edges_from_node (node);
   
-  prepare_tm_clone (node);
+  /* prepare_tm_clone (node); */
 
   return 0;
 }
--- except.c	(revision 141162)
+++ except.c	(local)
@@ -140,7 +140,8 @@ struct eh_region GTY(())
     ERT_CATCH,
     ERT_ALLOWED_EXCEPTIONS,
     ERT_MUST_NOT_THROW,
-    ERT_THROW
+    ERT_THROW,
+    ERT_TRANSACTION
   } type;
 
   /* Holds the action to perform based on the preceding type.  */
@@ -178,6 +179,11 @@ struct eh_region GTY(())
     struct eh_region_u_cleanup {
       struct eh_region *prev_try;
     } GTY ((tag ("ERT_CLEANUP"))) cleanup;
+
+    /* ??? Nothing for now.  */
+    struct eh_region_u_transaction {
+      int dummy;
+    } GTY ((tag ("ERT_TRANSACTION"))) transaction;
   } GTY ((desc ("%0.type"))) u;
 
   /* Entry point for this region's handler before landing pads are built.  */
@@ -253,7 +259,6 @@ static hashval_t ehl_hash (const void *)
 static int ehl_eq (const void *, const void *);
 static void add_ehl_entry (rtx, struct eh_region *);
 static void remove_exception_handler_label (rtx);
-static void remove_eh_handler (struct eh_region *);
 static int for_each_eh_label_1 (void **, void *);
 
 /* The return value of reachable_next_level.  */
@@ -422,7 +427,7 @@ gen_eh_region (enum eh_region_type type,
   struct eh_region *new_eh;
 
 #ifdef ENABLE_CHECKING
-  gcc_assert (doing_eh (0));
+  gcc_assert (flag_tm || doing_eh (0));
 #endif
 
   /* Insert a new blank region as a leaf in the tree.  */
@@ -509,12 +514,24 @@ gen_eh_region_must_not_throw (struct eh_
   return gen_eh_region (ERT_MUST_NOT_THROW, outer);
 }
 
+struct eh_region *
+gen_eh_region_transaction (struct eh_region *outer)
+{
+  return gen_eh_region (ERT_TRANSACTION, outer);
+}
+
 int
 get_eh_region_number (struct eh_region *region)
 {
   return region->region_number;
 }
 
+struct eh_region *
+get_eh_region_from_number (int region_nr)
+{
+  return VEC_index (eh_region, cfun->eh->region_array, region_nr);
+}
+
 bool
 get_eh_region_may_contain_throw (struct eh_region *region)
 {
@@ -808,7 +825,8 @@ current_function_has_exception_handlers 
       region = VEC_index (eh_region, cfun->eh->region_array, i);
       if (region
 	  && region->region_number == i
-	  && region->type != ERT_THROW)
+	  && region->type != ERT_THROW
+	  && region->type != ERT_TRANSACTION)
 	return true;
     }
 
@@ -1473,6 +1491,7 @@ build_post_landing_pads (void)
 
 	case ERT_CATCH:
 	case ERT_THROW:
+	case ERT_TRANSACTION:
 	  /* Nothing to do.  */
 	  break;
 
@@ -2142,7 +2161,7 @@ remove_exception_handler_label (rtx labe
 
 /* Splice REGION from the region tree etc.  */
 
-static void
+void
 remove_eh_handler (struct eh_region *region)
 {
   struct eh_region **pp, **pp_start, *p, *outer, *inner;
@@ -2524,6 +2543,11 @@ reachable_next_level (struct eh_region *
       else
 	return RNL_BLOCKED;
 
+    case ERT_TRANSACTION:
+      /* Transaction regions don't catch exceptions, they catch
+	 transaction restarts.  */
+      return RNL_NOT_CAUGHT;
+
     case ERT_THROW:
     case ERT_UNKNOWN:
       /* Shouldn't see these here.  */
@@ -2538,8 +2562,7 @@ reachable_next_level (struct eh_region *
 
 void
 foreach_reachable_handler (int region_number, bool is_resx,
-			   void (*callback) (struct eh_region *, void *),
-			   void *callback_data)
+			   eh_callback callback, void *callback_data)
 {
   struct reachable_info info;
   struct eh_region *region;
@@ -2581,6 +2604,26 @@ foreach_reachable_handler (int region_nu
     }
 }
 
+/* Invoke CALLBACK on the TRANSACTION region enclosing REGION_NUMBER.  */
+
+void
+foreach_reachable_transaction (int region_number,
+			       eh_callback callback, void *callback_data)
+{
+  struct eh_region *region;
+
+  region = VEC_index (eh_region, cfun->eh->region_array, region_number);
+  while (region)
+    {
+      if (region->type == ERT_TRANSACTION)
+	{
+	  callback (region, callback_data);
+	  break;
+	}
+      region = region->outer;
+    }
+}
+
 /* Retrieve a list of labels of exception handlers which can be
    reached by a given insn.  */
 
@@ -3177,8 +3220,10 @@ collect_one_action_chain (htab_t ar_hash
 
     case ERT_CATCH:
     case ERT_THROW:
+    case ERT_TRANSACTION:
       /* CATCH regions are handled in TRY above.  THROW regions are
-	 for optimization information only and produce no output.  */
+	 for optimization information only and produce no output.
+	 TRANSACTION regions produce no output as well.  */
       return collect_one_action_chain (ar_hash, region->outer);
 
     default:
--- except.h	(revision 141162)
+++ except.h	(local)
@@ -92,14 +92,16 @@ extern struct eh_region *gen_eh_region_t
 extern struct eh_region *gen_eh_region_catch (struct eh_region *, tree);
 extern struct eh_region *gen_eh_region_allowed (struct eh_region *, tree);
 extern struct eh_region *gen_eh_region_must_not_throw (struct eh_region *);
+extern struct eh_region *gen_eh_region_transaction (struct eh_region *);
 extern int get_eh_region_number (struct eh_region *);
+extern struct eh_region *get_eh_region_from_number (int);
 extern bool get_eh_region_may_contain_throw (struct eh_region *);
 extern tree get_eh_region_tree_label (struct eh_region *);
 extern void set_eh_region_tree_label (struct eh_region *, tree);
 
-extern void foreach_reachable_handler (int, bool,
-				       void (*) (struct eh_region *, void *),
-				       void *);
+typedef void (*eh_callback) (struct eh_region *, void *);
+extern void foreach_reachable_handler (int, bool, eh_callback, void *);
+extern void foreach_reachable_transaction (int, eh_callback, void *);
 
 extern void collect_eh_region_array (void);
 extern void expand_resx_expr (tree);
@@ -107,6 +109,7 @@ extern void verify_eh_tree (struct funct
 extern void dump_eh_tree (FILE *, struct function *);
 extern bool eh_region_outer_p (struct function *, int, int);
 extern int eh_region_outermost (struct function *, int, int);
+extern void remove_eh_handler (struct eh_region *);
 
 /* If non-NULL, this is a function that returns an expression to be
    executed if an unhandled exception is propagated out of a cleanup
--- gimple-low.c	(revision 141162)
+++ gimple-low.c	(local)
@@ -77,16 +77,12 @@ struct lower_data
 
   /* True if the function calls __builtin_setjmp.  */
   bool calls_builtin_setjmp;
-
-  /* True if we're lowering inside a transaction.  */
-  bool in_transaction;
 };
 
 static void lower_stmt (gimple_stmt_iterator *, struct lower_data *);
 static void lower_gimple_bind (gimple_stmt_iterator *, struct lower_data *);
 static void lower_gimple_return (gimple_stmt_iterator *, struct lower_data *);
 static void lower_builtin_setjmp (gimple_stmt_iterator *);
-static void record_vars_into_tm (tree, tree, bool);
 
 
 /* Lower the body of current_function_decl from High GIMPLE into Low
@@ -236,31 +232,6 @@ lower_sequence (gimple_seq seq, struct l
 }
 
 
-/* Lower the __tm_atomic statement pointed by TSI.  DATA is
-   passed through the recursion.  */
-
-static void
-lower_tm_atomic (gimple_stmt_iterator *gsi, struct lower_data *data)
-{
-  bool old_in_transaction = data->in_transaction;
-  gimple stmt = gsi_stmt (*gsi);
-  tree label = create_artificial_label ();
-
-  data->in_transaction = true;
-
-  lower_sequence (gimple_seq_body (stmt), data);
-
-  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
-  gsi_insert_seq_before (gsi, gimple_seq_body (stmt), GSI_SAME_STMT);
-  gsi_insert_before (gsi, gimple_build_label (label), GSI_SAME_STMT);
-
-  gimple_seq_set_body (stmt, NULL);
-  gimple_tm_atomic_set_label (stmt, label);
-  gsi_remove (gsi, false);
-
-  data->in_transaction = old_in_transaction;
-}
-
 /* Lower the OpenMP directive statement pointed by GSI.  DATA is
    passed through the recursion.  */
 
@@ -358,8 +329,8 @@ lower_stmt (gimple_stmt_iterator *gsi, s
       return;
 
     case GIMPLE_TM_ATOMIC:
-      lower_tm_atomic (gsi, data);
-      return;
+      lower_sequence (gimple_seq_body (stmt), data);
+      break;
 
     default:
       gcc_unreachable ();
@@ -405,8 +376,7 @@ lower_gimple_bind (gimple_stmt_iterator 
 	}
     }
 
-  record_vars_into_tm (gimple_bind_vars (stmt), current_function_decl,
-		       data->in_transaction);
+  record_vars (gimple_bind_vars (stmt));
   lower_sequence (gimple_bind_body (stmt), data);
 
   if (new_block)
@@ -820,11 +790,10 @@ lower_builtin_setjmp (gimple_stmt_iterat
 }
 \f
 
-/* Record the variables in VARS into function FN.  If IN_TRANSACTION is
-   true, mark them DECL_IS_TM_PURE_VAR.  */
+/* Record the variables in VARS into function FN.  */
 
-static void
-record_vars_into_tm (tree vars, tree fn, bool in_transaction)
+void
+record_vars_into (tree vars, tree fn)
 {
   if (fn != current_function_decl)
     push_cfun (DECL_STRUCT_FUNCTION (fn));
@@ -843,36 +812,21 @@ record_vars_into_tm (tree vars, tree fn,
 	continue;
 
       /* Record the variable.  */
-      cfun->local_decls = tree_cons (NULL_TREE, var,
-					     cfun->local_decls);
-
-      /* If we're inside a transaction, mark it for NOT checkpointing.  */
-      if (in_transaction)
-	DECL_IS_TM_PURE_VAR (var) = 1;
+      cfun->local_decls = tree_cons (NULL_TREE, var, cfun->local_decls);
     }
 
   if (fn != current_function_decl)
     pop_cfun ();
 }
 
-/* Record the variables in VARS into function FN.  */
-
-void
-record_vars_into (tree vars, tree fn)
-{
-  record_vars_into_tm (vars, fn, false);
-}
-
-
 /* Record the variables in VARS into current_function_decl.  */
 
 void
 record_vars (tree vars)
 {
-  record_vars_into_tm (vars, current_function_decl, false);
+  record_vars_into (vars, current_function_decl);
 }
 
-
 /* Mark BLOCK used if it has a used variable in it, then recurse over its
    subblocks.  */
 
--- gimple-pretty-print.c	(revision 141162)
+++ gimple-pretty-print.c	(local)
@@ -161,6 +161,7 @@ debug_gimple_seq (gimple_seq seq)
      'd' - outputs an int as a decimal,
      's' - outputs a string,
      'n' - outputs a newline,
+     'x' - outputs an int as hexadecimal,
      '+' - increases indent by 2 then outputs a newline,
      '-' - decreases indent by 2 then outputs a newline.   */
 
@@ -215,6 +216,10 @@ dump_gimple_fmt (pretty_printer *buffer,
                 newline_and_indent (buffer, spc);
                 break;
 
+              case 'x':
+                pp_scalar (buffer, "%x", va_arg (args, int));
+                break;
+
               case '+':
                 spc += 2;
                 newline_and_indent (buffer, spc);
@@ -350,9 +355,19 @@ dump_binary_rhs (pretty_printer *buffer,
 static void
 dump_gimple_assign (pretty_printer *buffer, gimple gs, int spc, int flags)
 {
+  enum tree_code code;
+
+  /* Don't bypass the transactional markers like
+     gimple_assign_rhs_code would.  */
+  code = gimple_expr_code (gs);
+  if (code != TM_LOAD && code != TM_STORE
+      && get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS)
+    code = TREE_CODE (gimple_assign_rhs1 (gs));
+
   if (flags & TDF_RAW)
     {
       tree last;
+
       if (gimple_num_ops (gs) == 2)
         last = NULL_TREE;
       else if (gimple_num_ops (gs) == 3)
@@ -361,8 +376,8 @@ dump_gimple_assign (pretty_printer *buff
         gcc_unreachable ();
 
       dump_gimple_fmt (buffer, spc, flags, "%G <%s, %T, %T, %T>", gs,
-                       tree_code_name[gimple_assign_rhs_code (gs)],
-                       gimple_assign_lhs (gs), gimple_assign_rhs1 (gs), last);
+                       tree_code_name[code], gimple_assign_lhs (gs),
+		       gimple_assign_rhs1 (gs), last);
     }
   else
     {
@@ -372,6 +387,11 @@ dump_gimple_assign (pretty_printer *buff
 	  pp_space (buffer);
 	  pp_character (buffer, '=');
 
+	  if (code == TM_LOAD)
+	    pp_string (buffer, "{tm_load}");
+	  else if (code == TM_STORE)
+	    pp_string (buffer, "{tm_store}");
+
 	  if (gimple_assign_nontemporal_move_p (gs))
 	    pp_string (buffer, "{nt}");
 
@@ -1013,11 +1033,18 @@ static void
 dump_gimple_tm_atomic (pretty_printer *buffer, gimple gs, int spc, int flags)
 {
   if (flags & TDF_RAW)
-    dump_gimple_fmt (buffer, spc, flags, "%G <%+BODY <%S> >", gs,
-		     gimple_seq_body (gs));
+    dump_gimple_fmt (buffer, spc, flags,
+		     "%G [SUBCODE=%x,LABEL=%T] <%+BODY <%S> >",
+		     gs, gimple_tm_atomic_subcode (gs),
+		     gimple_tm_atomic_label (gs), gimple_seq_body (gs));
   else
     {
-      pp_string (buffer, "__tm_atomic");
+      pp_printf (buffer, "__tm_atomic [SUBCODE=%x,LABEL=",
+		 gimple_tm_atomic_subcode (gs));
+      dump_generic_node (buffer, gimple_tm_atomic_label (gs),
+			 spc, flags, false);
+      pp_string (buffer, "]");
+
       if (!gimple_seq_empty_p (gimple_seq_body (gs)))
 	{
 	  newline_and_indent (buffer, spc + 2);
--- gimple.h	(revision 141162)
+++ gimple.h	(local)
@@ -735,6 +735,15 @@ struct gimple_statement_omp_atomic_store
 
 /* GIMPLE_TM_ATOMIC.  */
 
+/* Bits to be stored in the GIMPLE_TM_ATOMIC subcode.  */
+#define GTMA_HAVE_ABORT			(1u << 0)
+#define GTMA_HAVE_LOAD			(1u << 1)
+#define GTMA_HAVE_STORE			(1u << 2)
+#define GTMA_HAVE_CALL_TM		(1u << 3)
+#define GTMA_HAVE_CALL_IRREVOKABLE	(1u << 4)
+#define GTMA_MUST_CALL_IRREVOKABLE	(1u << 5)
+#define GTMA_HAVE_CALL_INDIRECT		(1u << 6)
+
 struct gimple_statement_tm_atomic GTY(())
 {
   /* [ WORD 1-5 ]  */
@@ -4134,6 +4143,15 @@ gimple_tm_atomic_label (const_gimple gs)
   return gs->gimple_tm_atomic.label;
 }
 
+/* Return the subcode associated with a GIMPLE_TM_ATOMIC.  */
+
+static inline unsigned int
+gimple_tm_atomic_subcode (const_gimple gs)
+{
+  GIMPLE_CHECK (gs, GIMPLE_TM_ATOMIC);
+  return gs->gsbase.subcode;
+}
+
 /* Set the label associated with a GIMPLE_TM_ATOMIC.  */
 
 static inline void
@@ -4143,6 +4161,16 @@ gimple_tm_atomic_set_label (gimple gs, t
   gs->gimple_tm_atomic.label = label;
 }
 
+/* Set the subcode associated with a GIMPLE_TM_ATOMIC.  */
+
+static inline void
+gimple_tm_atomic_set_subcode (gimple gs, unsigned int subcode)
+{
+  GIMPLE_CHECK (gs, GIMPLE_TM_ATOMIC);
+  gs->gsbase.subcode = subcode;
+}
+
+
 /* Return a pointer to the return value for GIMPLE_RETURN GS.  */
 
 static inline tree *
--- gtm-low.c	(revision 141162)
+++ gtm-low.c	(local)
@@ -1,6 +1,6 @@
 /* Lowering pass for transactional memory directives.
    Converts markers of transactions into explicit calls to
-   the STM runtime library.
+   the TM runtime library.
 
    Copyright (C) 2008 Free Software Foundation, Inc.
 
@@ -18,34 +18,144 @@
 
    You should have received a copy of the GNU General Public License
    along with GCC; see the file COPYING3.  If not see
-   <http://www.gnu.org/licenses/>.
-
-*/
+   <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
 #include "tm.h"
 #include "tree.h"
-#include "rtl.h"
 #include "gimple.h"
-#include "langhooks.h"
-#include "diagnostic.h"
 #include "tree-flow.h"
-#include "timevar.h"
-#include "flags.h"
-#include "function.h"
-#include "expr.h"
-#include "toplev.h"
 #include "tree-pass.h"
-#include "ggc.h"
 #include "except.h"
-#include "splay-tree.h"
-#include "optabs.h"
-#include "cfgloop.h"
-#include "tree-ssa-live.h"
-#include "tree-flow.h"
+#include "diagnostic.h"
+
+
+/* The representation of a transaction changes several times during the
+   lowering process.  In the beginning, in the front-end we have the
+   GENERIC tree TM_ATOMIC.  For example,
+
+	__tm_atomic {
+	  local++;
+	  if (++global == 10)
+	    __tm_abort;
+	}
+
+  is represented as
+
+	TM_ATOMIC {
+	  local++;
+	  if (++global == 10)
+	    __builtin___tm_abort ();
+	}
+
+  During initial gimplification (gimplify.c) the TM_ATOMIC node is
+  trivially replaced with a GIMPLE_TM_ATOMIC node, and we add bits
+  to handle EH cleanup of the transaction:
+
+	GIMPLE_TM_ATOMIC [label=NULL] {
+	  try {
+	    local = local + 1;
+	    t0 [tm_load]= global;
+	    t1 = t0 + 1;
+	    global [tm_store]= t1;
+	    if (t1 == 10)
+	      __builtin___tm_abort ();
+	  } finally {
+	    __builtin___tm_commit ();
+	  }
+	}
+
+  During pass_lower_eh, we create EH regions for the transactions,
+  intermixed with the regular EH stuff.  This gives us a nice persistent
+  mapping (all the way through rtl) from transactional memory operation
+  back to the transaction, which allows us to get the abnormal edges
+  correct to model transaction aborts and restarts.
+
+  During pass_lower_tm, we mark the gimple statements that perform
+  transactional memory operations with TM_LOAD/TM_STORE, and swap out
+  function calls with their (non-)transactional clones.  At this time
+  we flatten nested transactions (when possible), and flatten the
+  GIMPLE representation.
+
+	GIMPLE_TM_ATOMIC [label=over]
+	eh_label:
+	local = local + 1;
+	t0 [tm_load]= global;
+	t1 = t0 + 1;
+	global [tm_store]= t1;
+	if (t1 == 10)
+	  __builtin___tm_abort ();
+	__builtin___tm_commit ();
+	over:
+
+  During pass_checkpoint_tm, we complete the lowering of the
+  GIMPLE_TM_ATOMIC node.  Here we examine the SSA web and arrange for
+  local variables to be saved and restored in the event of an abort.
+
+	save_local = local;
+	x = __builtin___tm_start (MAY_ABORT);
+	eh_label:
+	if (x & restore_locals) {
+	  local = save_local;
+	}
+	if (x & abort_transaction)
+	  goto over;
+	local = local + 1;
+	t0 [tm_load]= global;
+	t1 = t0 + 1;
+	global [tm_store]= t1;
+	if (t1 == 10)
+	  __builtin___tm_abort ();
+	__builtin___tm_commit ();
+	over:
+
+  During expansion to rtl, we expand the TM_LOAD/TM_STORE markings
+  with calls to the appropriate builtin functions.  Delaying this long
+  allows the tree optimizers the most visibility into the operations.  */
+
+DEF_VEC_O(gimple_stmt_iterator);
+DEF_VEC_ALLOC_O(gimple_stmt_iterator,heap);
+
+struct ltm_state
+{
+  /* Bits to be stored in the GIMPLE_TM_ATOMIC subcode.  */
+  unsigned subcode;
+
+  /* The EH region number for this transaction.  Non-negative numbers
+     represent an active transaction within this function; -1 represents
+     an active transaction from a calling function (i.e. we're compiling
+     a transaction clone).  For no active transaction, the state pointer
+     passed will be null.  */
+  int region_nr;
+
+  /* Record the iterator pointing to a __tm_commit function call that
+     binds to this transaction region.  There may be many such calls,
+     depending on how the EH expansion of the try-finally node went.
+     But there's usually exactly one such call, and essentially always
+     only a small number, so to avoid rescanning the entire sequence
+     when we need to remove these calls, record the iterator location.  */
+  VEC(gimple_stmt_iterator,heap) *commit_stmts;
+};
+
+
+static void lower_sequence_tm (struct ltm_state *, gimple_seq);
+static void lower_sequence_no_tm (gimple_seq);
+
 
+/* Record the transaction for this statement.  If the statement
+   already has a region number that's fine -- it means that the
+   statement can also throw.  If there's no region number, it 
+   means we're expanding a transactional clone and the region
+   is in a calling function.  */
+
+static void
+add_stmt_to_transaction (struct ltm_state *state, gimple stmt)
+{
+  if (state->region_nr >= 0 && lookup_stmt_eh_region (stmt) < 0)
+    add_stmt_to_eh_region (stmt, state->region_nr);
+}
 
 /* Determine whether X has to be instrumented using a read
    or write barrier.  */
@@ -78,31 +188,47 @@ requires_barrier (tree x)
     }
 }
 
-/* Subsituting a MODIFY_STMT with calls to the STM runtime.  */
+/* Mark the GIMPLE_ASSIGN statement as appropriate for being inside
+   a transaction region.  */
 
 static void
-maybe_transactify_assign (gimple stmt)
+lower_assign_tm (struct ltm_state *state, gimple_stmt_iterator *gsi)
 {
+  gimple stmt = gsi_stmt (*gsi);
   bool load_p = requires_barrier (gimple_assign_rhs1 (stmt));
   bool store_p = requires_barrier (gimple_assign_lhs (stmt));
 
-  if (load_p)
+  if (load_p && store_p)
+    {
+      /* ??? This is a copy between two aggregates in memory.  I
+	 believe the Intel compiler handles this with a special
+	 version of memcpy.  For now, just consider the transaction
+	 irrevokable at this point.  */
+      state->subcode |= GTMA_HAVE_CALL_IRREVOKABLE;
+      return;
+    }
+  else if (load_p)
     {
-      gcc_assert (!store_p);
       gimple_assign_set_rhs_code (stmt, TM_LOAD);
+      state->subcode |= GTMA_HAVE_LOAD;
     }
   else if (store_p)
-    gimple_assign_set_rhs_code (stmt, TM_STORE);
+    {
+      gimple_assign_set_rhs_code (stmt, TM_STORE);
+      state->subcode |= GTMA_HAVE_STORE;
+    }
+  else
+    return;
+
+  add_stmt_to_transaction (state, stmt);
 }
 
-/* Helper function that replaces call expressions inside
-   transactions and issues a warning if no transactional
-   clone is found. */
+/* Mark a GIMPLE_CALL as appropriate for being inside a transaction.  */
 
 static void
-maybe_transactify_call (gimple stmt)
+lower_call_tm (struct ltm_state *state, gimple_stmt_iterator *gsi)
 {
-  bool redirected = false;
+  gimple stmt = gsi_stmt (*gsi);
   tree fn_decl;
   struct cgraph_node *node, *orig_node;
   int flags;
@@ -114,9 +240,28 @@ maybe_transactify_call (gimple stmt)
   fn_decl = gimple_call_fndecl (stmt);
   if (!fn_decl)
     {
-      warning (0, "Indirect call potentially breaks isolation of transactions");
+      state->subcode |= GTMA_HAVE_CALL_INDIRECT;
       return;
     }
+
+  /* Check if this call is one of our transactional builtins.  */
+  if (DECL_BUILT_IN_CLASS (fn_decl) == BUILT_IN_NORMAL)
+    switch (DECL_FUNCTION_CODE (fn_decl))
+      {
+      case BUILT_IN_TM_COMMIT:
+	/* Remember the commit so that we can remove it if
+	   we decide to elide the transaction.  */
+	VEC_safe_push (gimple_stmt_iterator,heap, state->commit_stmts, gsi);
+	return;
+      case BUILT_IN_TM_ABORT:
+	state->subcode |= GTMA_HAVE_ABORT;
+	add_stmt_to_transaction (state, stmt);
+	return;
+
+      default:
+	break;
+      }
+
   if (DECL_IS_TM_PURE (fn_decl))
     return;
 
@@ -129,12 +274,11 @@ maybe_transactify_call (gimple stmt)
       if (DECL_IS_TM_CLONE (node->decl))
 	break;
     }
-
   if (DECL_IS_TM_CLONE (node->decl))
     {
       struct cgraph_edge *callers = orig_node->callers;
 
-      /* Find appropriate call stmt to redirect */
+      /* Find appropriate call stmt to redirect.  */
       while (callers)
 	{
 	  if (callers->call_stmt == stmt)
@@ -142,68 +286,192 @@ maybe_transactify_call (gimple stmt)
 	  callers = callers->next_caller;
 	}
 
-      /* Substitute call stmt. */
+      /* Substitute call stmt.  */
       if (callers)
 	{
 	  gimple_call_set_fndecl (stmt, node->decl);
 	  cgraph_redirect_edge_callee (callers, node);
 	  if (dump_file)
-	    fprintf (dump_file, "redirected edge to %s\n",
-		     get_name (node->decl));
-	  redirected = true;
+	    {
+	      fprintf (dump_file, "redirected edge to ");
+	      print_generic_expr (dump_file, node->decl, 0);
+	      fputc ('\n', dump_file);
+	    }
+
+	  state->subcode |= GTMA_HAVE_CALL_TM;
+	  add_stmt_to_transaction (state, stmt);
+	  return;
 	}
     }
 
-  /* In case the function call was not redirected and the function
-     not marked as const or tm_pure, issue a warning. */
-  /* ??? Handling of calls to irrevocable functions can be expanded here. */
-  if (!redirected)
-    warning (0, "Call to %qD potentially breaks isolation of transactions.",
-	     fn_decl);
+  /* The function was not const, tm_pure, or redirected to a 
+     transactional clone.  The call is therefore considered to
+     be irrevokable.  */
+  state->subcode |= GTMA_HAVE_CALL_IRREVOKABLE;
+}
+
+/* Remove any calls to __tm_commit inside STATE which belong
+   to the transaction.  */
+
+static void
+remove_tm_commits (struct ltm_state *state)
+{
+  gimple_stmt_iterator *gsi;
+  unsigned i;
+
+  for (i = 0;
+       VEC_iterate(gimple_stmt_iterator, state->commit_stmts, i, gsi);
+       ++i)
+    gsi_remove (gsi, true);
 }
 
-/* This function expands the stmts within a transaction so that
-   the corresponding STM versions of the stmt is called. */
+/* Lower a GIMPLE_TM_ATOMIC statement.  The GSI is advanced.  */
 
-static void ATTRIBUTE_UNUSED
-transactify_stmt (gimple_stmt_iterator *gsi)
+static void
+lower_tm_atomic (struct ltm_state *outer_state, gimple_stmt_iterator *gsi)
 {
   gimple stmt = gsi_stmt (*gsi);
+  struct ltm_state this_state;
+  struct eh_region *eh_region;
+  tree label;
+
+  this_state.subcode = 0;
+  this_state.region_nr = lookup_stmt_eh_region (stmt);
+  this_state.commit_stmts = VEC_alloc(gimple_stmt_iterator, heap, 1);
+
+  gcc_assert (this_state.region_nr >= 0);
+  eh_region = get_eh_region_from_number (this_state.region_nr);
+
+  /* First, lower the body.  The scanning that we do inside gives
+     us some idea of what we're dealing with.  */
+  lower_sequence_tm (&this_state, gimple_seq_body (stmt));
+
+  /* If there was absolutely nothing transaction related inside the
+     transaction, we may elide it.  Likewise if this is a nested
+     transaction and does not contain an abort.  */
+  if (this_state.subcode == 0
+      || (!(this_state.subcode & GTMA_HAVE_ABORT)
+	  && outer_state != NULL))
+    {
+      if (outer_state)
+	outer_state->subcode |= this_state.subcode;
 
-  switch (gimple_code (stmt))
+      remove_tm_commits (&this_state);
+      gsi_insert_seq_before (gsi, gimple_seq_body (stmt), GSI_SAME_STMT);
+      gimple_seq_set_body (stmt, NULL);
+      gsi_remove (gsi, true);
+      remove_eh_handler (eh_region);
+      goto fini;
+    }
+
+  /* Insert an EH_LABEL immediately after the GIMPLE_TM_ATOMIC node.
+     This label won't really be used, but it mimics the effect of 
+     the setjmp/longjmp that's going on behind the scenes.  */
+  label = create_artificial_label ();
+  set_eh_region_tree_label (eh_region, label);
+  gsi_insert_after (gsi, gimple_build_label (label), GSI_CONTINUE_LINKING);
+
+  /* Insert the entire transaction sequence.  */
+  gsi_insert_seq_after (gsi, gimple_seq_body (stmt), GSI_CONTINUE_LINKING);
+  gimple_seq_set_body (stmt, NULL);
+
+  /* Record a label at the end of the transaction that will be used in
+     case the transaction aborts.  */
+  if (this_state.subcode & GTMA_HAVE_ABORT)
     {
-    case GIMPLE_CALL:
-      maybe_transactify_call (stmt);
-      break;
+      label = create_artificial_label ();
+      gimple_tm_atomic_set_label (stmt, label);
+      gsi_insert_after (gsi, gimple_build_label (label), GSI_CONTINUE_LINKING);
+    }
 
-    case GIMPLE_ASSIGN:
-      /* Only memory reads/writes need to be instrumented.  */
-      if (gimple_assign_single_p (stmt))
-	maybe_transactify_assign (stmt);
-      break;
+  /* Record the set of operations found for use during final lowering
+     of the GIMPLE_TM_ATOMIC node.  */
+  gimple_tm_atomic_set_subcode (stmt, this_state.subcode);
 
-    default:
-      break;
+  /* Always update the iterator.  */
+  gsi_next (gsi);
+
+ fini:
+  VEC_free (gimple_stmt_iterator, heap, this_state.commit_stmts);
+}
+
+/* Iterate through the statements in the sequence, lowering them all
+   as appropriate for being in a transaction.  */
+
+static void
+lower_sequence_tm (struct ltm_state *state, gimple_seq seq)
+{
+  gimple_stmt_iterator gsi;
+
+  for (gsi = gsi_start (seq); !gsi_end_p (gsi); )
+    {
+      gimple stmt = gsi_stmt (gsi);
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_ASSIGN:
+	  /* Only memory reads/writes need to be instrumented.  */
+	  if (gimple_assign_single_p (stmt))
+	    lower_assign_tm (state, &gsi);
+	  break;
+
+	case GIMPLE_CALL:
+	  lower_call_tm (state, &gsi);
+	  break;
+
+	case GIMPLE_TM_ATOMIC:
+	  lower_tm_atomic (state, &gsi);
+	  goto no_update;
+
+	default:
+	  break;
+	}
+      gsi_next (&gsi);
+    no_update:
+      ;
     }
 }
 
-/* Main entry point for expanding TM-GIMPLE into runtime calls to the STM. */
+/* Iterate through the statements in the sequence, lowering them all
+   as appropriate for being outside of a transaction.  */
 
-static unsigned int
-execute_expand_tm (void)
+static void
+lower_sequence_no_tm (gimple_seq seq)
 {
-  bool in_transaction = false;
+  gimple_stmt_iterator gsi;
+
+  for (gsi = gsi_start (seq); !gsi_end_p (gsi); )
+    if (gimple_code (gsi_stmt (gsi)) == GIMPLE_TM_ATOMIC)
+      lower_tm_atomic (NULL, &gsi);
+    else
+      gsi_next (&gsi);
+}
+
+/* Main entry point for flattening GIMPLE_TM_ATOMIC constructs.  After
+   this, GIMPLE_TM_ATOMIC nodes still exist, but the nested body has
+   been moved out, and all the data required for constructing a proper
+   CFG has been recorded.  */
 
+static unsigned int
+execute_lower_tm (void)
+{
   /* Functions that are marked TM_PURE don't need annotation by definition.  */
+  /* ??? The Intel OOPSLA paper talks about performing the same scan of the
+     function as we would if the function was marked DECL_IS_TM_CLONE, and
+     warning if we find anything for which we would have made a change.  */
   if (DECL_IS_TM_PURE (current_function_decl))
     return 0;
 
   /* When instrumenting a transactional clone, we begin the function inside
      a transaction.  */
   if (DECL_IS_TM_CLONE (current_function_decl))
-    in_transaction = true;
-
-  /* Walk dominator tree expanding blocks.  Seed with in_transaction.  */
+    {
+      struct ltm_state state;
+      state.subcode = 0;
+      state.region_nr = -1;
+      lower_sequence_tm (&state, gimple_body (current_function_decl));
+    }
+  else
+    lower_sequence_no_tm (gimple_body (current_function_decl));
 
   return 0;
 }
@@ -216,227 +484,162 @@ gate_tm (void)
   return flag_tm;
 }
 
-struct gimple_opt_pass pass_expand_tm =
+struct gimple_opt_pass pass_lower_tm =
 {
  {
   GIMPLE_PASS,
-  "tmexp",				/* name */
+  "tmlower",				/* name */
   gate_tm,				/* gate */
-  execute_expand_tm,			/* execute */
+  execute_lower_tm,			/* execute */
   NULL,					/* sub */
   NULL,					/* next */
   0,					/* static_pass_number */
   0,					/* tv_id */
-  PROP_gimple_any,			/* properties_required */
+  PROP_gimple_leh,			/* properties_required */
   0,			                /* properties_provided */
   0,					/* properties_destroyed */
   0,					/* todo_flags_start */
-  TODO_dump_func
-  | TODO_cleanup_cfg
-  | TODO_ggc_collect,		        /* todo_flags_finish */
+  TODO_dump_func		        /* todo_flags_finish */
  }
 };
+\f
 
-
-#if 0
-/* Calculate live ranges on SSA. Then checkpoint the
-   live-in variables to the transaction. */
+/* ??? Find real values for these bits.  */
+#define TM_START_RESTORE_LIVE_IN	1
+#define TM_START_ABORT			2
+
+/* All local variables that have been modified since the beginning of the
+   transaction up until the last possible transaction restart need to be
+   reset when we restart the transaction.
+
+   Here we implement this by replacing the new SSA_NAME created by the
+   PHI at the join of the abnormal backedges by the old SSA_NAME that
+   was originally live at entry to the transaction.  This does result in
+   the old SSA_NAME continuing to stay live when new values are defined,
+   and this isn't necessarily the most efficient implementation, but it
+   is just about the easiest.  */
 
 static void
-checkpoint_live_in_variables (struct tm_region *region,
-			      gimple_stmt_iterator *gsi_recover,
-			      basic_block begin_bb)
-{
-  int index = begin_bb->index;
-  block_stmt_iterator bsi_save = bsi_for_stmt (region->setjmp_stmt);
-  basic_block save_bb = bb_for_stmt (region->setjmp_stmt);
-  basic_block recover_bb = bb_for_stmt (bsi_stmt (*bsi_recover));
-  tree ssa_var;
-  tree_live_info_p liveinfo;
-  var_map map;
-  int p;
-  tree rep;
-  unsigned int i;
-  unsigned int j;
-  bitmap_iterator bi;
-
-  map = init_var_map (num_ssa_names + 1);
-
-  /* Create liveness information for each SSA_NAME. */
-  for (j = 0; j < num_ssa_names; j++)
-    {
-      ssa_var = ssa_name (j);
-      if (!ssa_var)
-	continue;
-
-      if (TREE_CODE (ssa_var) == SSA_NAME)
-	{
-	  register_ssa_partition (map, ssa_var);
-	  p = partition_find (map->var_partition, SSA_NAME_VERSION (ssa_var));
-	  gcc_assert (p != NO_PARTITION);
-	  rep = partition_to_var (map, p);
-	}
-    }
-
-  liveinfo = calculate_live_ranges (map);
+checkpoint_live_in_variables (edge e)
+{
+  gimple_stmt_iterator gsi;
 
-  /* If variable is live-in at beginning of the
-     transaction checkpoint its value. */
-  if (liveinfo->livein)
+  for (gsi = gsi_start_phis (e->dest); !gsi_end_p (gsi); )
     {
-      if (dump_file)
-	fprintf (dump_file, "\nCheckpoint variables for transaction. BB %d : ", index);
-
-      EXECUTE_IF_SET_IN_BITMAP (liveinfo->livein[index], 0, i, bi)
+      gimple phi = gsi_stmt (gsi);
+      tree old_ssa, new_ssa;
+      unsigned i, n;
+
+      new_ssa = gimple_phi_result (phi);
+      old_ssa = gimple_phi_arg_def (phi, e->dest_idx);
+
+      /* We need to recompute SSA_NAME_OCCURS_IN_ABNORMAL_PHI on each
+	 of the other arguments to the PHI, discounting the one abnormal
+	 phi node that we're about to delete.  */
+      for (i = 0, n = gimple_phi_num_args (phi); i < n; ++i)
 	{
-	  tree var =  partition_to_var (map, i);
+	  tree arg = gimple_phi_arg_def (phi, i);
+	  imm_use_iterator imm_iter;
+	  use_operand_p use_p;
+	  bool in_abnormal_phi;
+
+	  if (TREE_CODE (arg) != SSA_NAME
+	      || !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (arg))
+	    continue;
 
-	  /* TODO check restricts the use of temporaries by the compiler
-	     may impact other optimisations.
-	     Maybe reordering this part of the checkpointing before introducing
-	     temporary variables would avoid this check. */
-	  if ((!DECL_ARTIFICIAL (SSA_NAME_VAR (var)))
-	      && (!POINTER_TYPE_P (TREE_TYPE (var))))
+	  in_abnormal_phi = false;
+	  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, arg)
 	    {
-	      if (dump_file)
+	      gimple phi2 = USE_STMT (use_p);
+	      if (gimple_code (phi2) == GIMPLE_PHI && phi2 != phi)
 		{
-		  print_generic_expr (dump_file, var, TDF_SLIM);
-		  fprintf (dump_file, "  ");
-		}
-	      /* Create name for temporary variable
-		 that checkpoints value of var. */
-	      const char* orig = get_name (SSA_NAME_VAR (var));
-	      int len = strlen (orig);
-	      char *name = xmalloc (sizeof (char) * (len + 10));
-	      strncpy (name, "txn_save_", 9);
-	      strncpy (name + 9, orig, len);
-	      *(name + len + 9) = '\0';
-
-	      /* Create temporary. */
-	      tree type = TREE_TYPE (var);
-	      tree save = create_tmp_var (type, name);
-	      add_referenced_var (save);
-	      tree stmt;
-
-	      /* Create gimple statement for saving value of var. */
-	      stmt = fold_build2 (GIMPLE_MODIFY_STMT, type, save, var);
-	      tree real_save = make_ssa_name (save, stmt);
-	      SSA_NAME_OCCURS_IN_ABNORMAL_PHI (real_save) = true;
-	      GIMPLE_STMT_OPERAND (stmt, 0) = real_save;
-
-	      bsi_insert_before (&bsi_save, stmt, BSI_SAME_STMT);
-
-	      /* Create gimple statement for restoring value of var. */
- 	      stmt = fold_build2 (GIMPLE_MODIFY_STMT, type, var, real_save);
-	      tree new_var = make_ssa_name (SSA_NAME_VAR (var), stmt);
-	      GIMPLE_STMT_OPERAND (stmt, 0) = new_var;
-	      bsi_insert_before (bsi_recover, stmt, BSI_SAME_STMT);
-
-	      /* Merge saved or recovered values before next basic block. */
-	      tree phi = create_phi_node (SSA_NAME_VAR (var), begin_bb);
-	      add_phi_arg (phi, new_var, FALLTHRU_EDGE (recover_bb));
-	      add_phi_arg (phi, var, FALLTHRU_EDGE (save_bb));
-	      tree new_var_phi = PHI_RESULT (phi);
-
-	      free_dominance_info (CDI_DOMINATORS);
-	      calculate_dominance_info (CDI_DOMINATORS);
-
-	      tree stmt2;
-	      imm_use_iterator iter;
-	      use_operand_p use_p;
-	      FOR_EACH_IMM_USE_STMT (stmt2, iter, var)
-		{
-		  if (stmt2 == phi)
-		    continue;
-
-		  basic_block tmp_bb = bb_for_stmt (stmt2);
-		  if (dominated_by_p (CDI_DOMINATORS, tmp_bb, begin_bb))
+		  unsigned ix = PHI_ARG_INDEX_FROM_USE (use_p);
+		  if (gimple_phi_arg_edge (phi2, ix)->flags & EDGE_ABNORMAL)
 		    {
-		      FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
-			propagate_value (use_p, new_var_phi);
+		      in_abnormal_phi = true;
+		      break;
 		    }
 		}
 	    }
+	  SSA_NAME_OCCURS_IN_ABNORMAL_PHI (arg) = in_abnormal_phi;
 	}
-      if (dump_file)
-	fprintf (dump_file, "\n");
 
+      replace_uses_by (new_ssa, old_ssa);
+      remove_phi_node (&gsi, true);
     }
-  update_ssa(TODO_update_ssa);
-
-  return ;
 }
 
-/* Implements the checkpointing of transactions. */
 static void
-checkpoint_tm_txn (struct tm_region *region)
+expand_tm_atomic (basic_block bb, gimple_stmt_iterator *gsi)
 {
-  basic_block entry_bb = bb_for_stmt (region->setjmp_stmt);
-
-  edge branch = BRANCH_EDGE (entry_bb);
-  edge fall = FALLTHRU_EDGE (entry_bb);
+  tree status, tm_start;
+  basic_block body_bb, test_bb;
+  gimple_stmt_iterator gsi2;
+  tree t1, t2;
+  gimple g;
+  edge e;
+
+  tm_start = built_in_decls[BUILT_IN_TM_START];
+  status = make_rename_temp (TREE_TYPE (TREE_TYPE (tm_start)), "tm_state");
+
+  e = FALLTHRU_EDGE (bb);
+  body_bb = e->dest;
+  checkpoint_live_in_variables (e);
 
-  basic_block begin_bb = fall->dest;
-  basic_block recover_bb = branch->dest;
-  basic_block next_bb = single_succ (recover_bb);
-
-  gcc_assert(begin_bb == next_bb);
-  block_stmt_iterator bsi_recover = bsi_start (recover_bb);
-  gcc_assert (TREE_CODE (bsi_stmt (bsi_recover)) == LABEL_EXPR);
-
-  bsi_next (&bsi_recover);
-
-  checkpoint_live_in_variables (region, &bsi_recover, begin_bb);
-}
-
-/* Walk the region tree and start checkpointing. */
-static void
-checkpoint_tm (struct tm_region *region)
-{
-  while (region)
+  if (gimple_tm_atomic_label (gsi_stmt (*gsi)))
     {
-      /* First, introduce checkpoints for the inner regions.
-	 TODO: testing. Overlapping of inner and outer
-	 regions not handled correctly.
-	 Nesting of transactions not implemented correctly.*/
-      if (region->inner)
-	{
-	  checkpoint_tm_txn (region->inner);
-	}
-
-      checkpoint_tm_txn (region);
-
-      region = region->next;
+      e = split_block_after_labels (body_bb);
+      test_bb = e->src;
+      body_bb = e->dest;
+
+      gsi2 = gsi_last_bb (test_bb);
+
+      t1 = make_rename_temp (TREE_TYPE (status), NULL);
+      t2 = build_int_cst (TREE_TYPE (status), TM_START_ABORT);
+      g = gimple_build_assign_with_ops (BIT_AND_EXPR, t1, status, t2);
+      gsi_insert_after (&gsi2, g, GSI_CONTINUE_LINKING);
+
+      t2 = build_int_cst (TREE_TYPE (status), 0);
+      g = gimple_build_cond (NE_EXPR, t1, t2, NULL, NULL);
+      gsi_insert_after (&gsi2, g, GSI_CONTINUE_LINKING);
+
+      single_succ_edge (test_bb)->flags = EDGE_FALSE_VALUE;
+
+      e = BRANCH_EDGE (bb);
+      redirect_edge_pred (e, test_bb);
+      e->flags = EDGE_TRUE_VALUE;
     }
+
+  /* ??? Need to put the real input to __tm_start here.  */
+  t2 = build_int_cst (TREE_TYPE (status), 0);
+  g = gimple_build_call (tm_start, 1, t2);
+  gimple_call_set_lhs (g, status);
+  gsi_insert_before (gsi, g, GSI_SAME_STMT);
+  gsi_remove (gsi, true);
 }
 
 /* Entry point to the checkpointing. */
-void
+
+static unsigned int
 execute_checkpoint_tm (void)
 {
-  /* Regions are built during TM expansion pass. */
-  if (!root_tm_region)
-    return;
-
-  /* Checkpointing is done here. */
-  checkpoint_tm (root_tm_region);
+  basic_block bb;
 
-  if (dump_file)
+  FOR_EACH_BB (bb)
     {
-      fprintf (dump_file, "\nTM region tree after checkpointing\n\n");
-      dump_tm_region (dump_file, root_tm_region, 0);
-      fprintf (dump_file, "\n");
+      gimple_stmt_iterator gsi = gsi_last_bb (bb);
+      if (!gsi_end_p (gsi)
+	  && gimple_code (gsi_stmt (gsi)) == GIMPLE_TM_ATOMIC)
+	expand_tm_atomic (bb, &gsi);
     }
 
-  free_dominance_info (CDI_DOMINATORS);
-  cleanup_tree_cfg ();
-  free_tm_regions ();
-
-  return;
+  return 0;
 }
 
-struct tree_opt_pass pass_checkpoint_tm =
+struct gimple_opt_pass pass_checkpoint_tm =
 {
+ {
+  GIMPLE_PASS,
   "tmcheckpoint",			/* name */
   gate_tm,				/* gate */
   execute_checkpoint_tm,		/* execute */
@@ -451,6 +654,84 @@ struct tree_opt_pass pass_checkpoint_tm 
   TODO_update_ssa |
   TODO_verify_ssa |
   TODO_dump_func,			/* todo_flags_finish */
-  0					/* letter */
+ }
 };
-#endif
+\f
+/* Construct transaction restart edges for STMT.  */
+
+static void
+make_tm_edge_1 (struct eh_region *region, void *data)
+{
+  gimple stmt = (gimple) data;
+  basic_block src, dst;
+  unsigned flags;
+
+  src = gimple_bb (stmt);
+  dst = label_to_block (get_eh_region_tree_label (region));
+
+  /* Don't set EDGE_EH here, because that's supposed to be used when
+     we could in principle redirect the edge by modifying the exception
+     tables.  Transactions don't use tables though, only setjmp.  */
+  flags = EDGE_ABNORMAL;
+  if (gimple_code (stmt) == GIMPLE_CALL)
+    flags |= EDGE_ABNORMAL_CALL;
+  make_edge (src, dst, flags);
+}
+
+void
+make_tm_edge (gimple stmt)
+{
+  int region_nr;
+
+  /* Do nothing if the region is outside this function.  */
+  region_nr = lookup_stmt_eh_region (stmt);
+  if (region_nr < 0)
+    return;
+
+  /* The control structure inside tree-cfg.c isn't the best;
+     re-check whether this is actually a transactional stmt.  */
+  if (!is_transactional_stmt (stmt))
+    return;
+
+  foreach_reachable_transaction (region_nr, make_tm_edge_1, (void *) stmt);
+}
+
+/* Return true if STMT may alter control flow via a transactional edge.  */
+
+bool
+is_transactional_stmt (const_gimple stmt)
+{
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      {
+	/* We only want to process assignments that have been
+	   marked for transactional memory.  */
+	enum tree_code subcode = gimple_expr_code (stmt);
+	return (subcode == TM_LOAD || subcode == TM_STORE);
+      }
+
+    case GIMPLE_CALL:
+      {
+	tree fn_decl = gimple_call_fndecl (stmt);
+
+	/* We only want to process __tm_abort and cloned direct calls,
+	   since those are the only ones that can restart the transaction.  */
+	if (!fn_decl)
+	  return false;
+	if (DECL_BUILT_IN_CLASS (fn_decl) == BUILT_IN_NORMAL
+	    && DECL_FUNCTION_CODE (fn_decl) == BUILT_IN_TM_ABORT)
+	  return true;
+	if (DECL_IS_TM_CLONE (fn_decl))
+	  return true;
+	else
+	  return false;
+      }
+
+    case GIMPLE_TM_ATOMIC:
+      return true;
+
+    default:
+      return false;
+    }
+}
--- passes.c	(revision 141162)
+++ passes.c	(local)
@@ -515,6 +515,7 @@ init_optimization_passes (void)
   NEXT_PASS (pass_lower_cf);
   NEXT_PASS (pass_refactor_eh);
   NEXT_PASS (pass_lower_eh);
+  NEXT_PASS (pass_lower_tm);
   NEXT_PASS (pass_build_cfg);
   NEXT_PASS (pass_lower_complex_O0);
   NEXT_PASS (pass_lower_vector);
@@ -540,13 +541,11 @@ init_optimization_passes (void)
       NEXT_PASS (pass_cleanup_cfg);
       NEXT_PASS (pass_init_datastructures);
       NEXT_PASS (pass_expand_omp);
-      NEXT_PASS (pass_expand_tm);
 
       NEXT_PASS (pass_referenced_vars);
       NEXT_PASS (pass_reset_cc_flags);
       NEXT_PASS (pass_build_ssa);
       NEXT_PASS (pass_early_warn_uninitialized);
-      /* NEXT_PASS (pass_checkpoint_tm); */
       NEXT_PASS (pass_all_early_optimizations);
 	{
 	  struct opt_pass **p = &pass_all_early_optimizations.pass.sub;
@@ -565,6 +564,7 @@ init_optimization_passes (void)
 	  NEXT_PASS (pass_convert_switch);
           NEXT_PASS (pass_profile);
 	}
+      NEXT_PASS (pass_checkpoint_tm);
       NEXT_PASS (pass_release_ssa_names);
       NEXT_PASS (pass_rebuild_cgraph_edges);
       NEXT_PASS (pass_inline_parameters);
--- tree-cfg.c	(revision 141162)
+++ tree-cfg.c	(local)
@@ -514,6 +514,10 @@ make_edges (void)
 		 create abnormal edges to them.  */
 	      make_eh_edges (last);
 
+	      /* If this statement calls a transaction clone,
+		 add transactional restart edges.  */
+	      make_tm_edge (last);
+
 	      /* Some calls are known not to return.  */
 	      fallthru = !(gimple_call_flags (last) & ECF_NORETURN);
 	      break;
@@ -524,6 +528,7 @@ make_edges (void)
 	      if (is_ctrl_altering_stmt (last))
 		{
 		  make_eh_edges (last);
+		  make_tm_edge (last);
 		}
 	      fallthru = true;
 	      break;
@@ -613,11 +618,12 @@ make_edges (void)
 	      break;
 
 	    case GIMPLE_TM_ATOMIC:
-	      /* ??? The edge from __tm_atomic to the out-label is only
-		 used when __tm_abort is present.  Detect that's not
-		 present and omit it.  */
-	      make_edge (bb, label_to_block (gimple_tm_atomic_label (last)), 0);
-	      fallthru = true;
+	      {
+		tree abort_label = gimple_tm_atomic_label (last);
+		if (abort_label)
+		  make_edge (bb, label_to_block (abort_label), 0);
+		fallthru = true;
+	      }
 	      break;
 
 	    default:
@@ -988,22 +994,30 @@ cleanup_dead_labels (void)
   FOR_EACH_BB (bb)
     {
       gimple stmt = last_stmt (bb);
+      tree label, new_label;
+
       if (!stmt)
 	continue;
 
       switch (gimple_code (stmt))
 	{
 	case GIMPLE_COND:
-	  {
-	    tree true_label = gimple_cond_true_label (stmt);
-	    tree false_label = gimple_cond_false_label (stmt);
+	  label = gimple_cond_true_label (stmt);
+	  if (label)
+	    {
+	      new_label = main_block_label (label);
+	      if (new_label != label)
+		gimple_cond_set_true_label (stmt, new_label);
+	    }
 
-	    if (true_label)
-	      gimple_cond_set_true_label (stmt, main_block_label (true_label));
-	    if (false_label)
-	      gimple_cond_set_false_label (stmt, main_block_label (false_label));
-	    break;
-	  }
+	  label = gimple_cond_false_label (stmt);
+	  if (label)
+	    {
+	      new_label = main_block_label (label);
+	      if (new_label != label)
+		gimple_cond_set_false_label (stmt, new_label);
+	    }
+	  break;
 
 	case GIMPLE_SWITCH:
 	  {
@@ -1013,8 +1027,10 @@ cleanup_dead_labels (void)
 	    for (i = 0; i < n; ++i)
 	      {
 		tree case_label = gimple_switch_label (stmt, i);
-		tree label = main_block_label (CASE_LABEL (case_label));
-		CASE_LABEL (case_label) = label;
+		label = CASE_LABEL (case_label);
+		new_label = main_block_label (label);
+		if (new_label != label)
+		  CASE_LABEL (case_label) = new_label;
 	      }
 	    break;
 	  }
@@ -1024,10 +1040,24 @@ cleanup_dead_labels (void)
 	case GIMPLE_GOTO:
           if (!computed_goto_p (stmt))
 	    {
-	      tree new_dest = main_block_label (gimple_goto_dest (stmt));
-	      gimple_goto_set_dest (stmt, new_dest);
-	      break;
+	      label = gimple_goto_dest (stmt);
+	      new_label = main_block_label (label);
+	      if (new_label != label)
+		gimple_goto_set_dest (stmt, new_label);
 	    }
+	  break;
+
+	case GIMPLE_TM_ATOMIC:
+	  {
+	    tree label = gimple_tm_atomic_label (stmt);
+	    if (label)
+	      {
+		tree new_label = main_block_label (label);
+		if (new_label != label)
+		  gimple_tm_atomic_set_label (stmt, new_label);
+	      }
+	  }
+	  break;
 
 	default:
 	  break;
@@ -2578,12 +2608,14 @@ is_ctrl_altering_stmt (gimple t)
   if (is_gimple_omp (t))
     return true;
 
-  /* __tm_atomic can alter control flow.  */
-  if (gimple_code (t) == GIMPLE_TM_ATOMIC)
+  /* If a statement can throw, it alters control flow.  */
+  if (stmt_can_throw_internal (t))
     return true;
 
-  /* If a statement can throw, it alters control flow.  */
-  return stmt_can_throw_internal (t);
+  if (flag_tm && is_transactional_stmt (t))
+    return true;
+
+  return false;
 }
 
 
@@ -4085,12 +4117,15 @@ verify_stmt (gimple_stmt_iterator *gsi)
      to match.  */
   if (lookup_stmt_eh_region (stmt) >= 0)
     {
-      if (!stmt_could_throw_p (stmt))
+      bool is_tm = is_transactional_stmt (stmt);
+
+      if (!is_tm && !stmt_could_throw_p (stmt))
 	{
 	  error ("statement marked for throw, but doesn%'t");
 	  goto fail;
 	}
-      if (!last_in_block && stmt_can_throw_internal (stmt))
+      /* ??? Add a transactional function akin to stmt_can_throw_internal.  */
+      if (!last_in_block && !is_tm && stmt_can_throw_internal (stmt))
 	{
 	  error ("statement marked for throw in middle of block");
 	  goto fail;
--- tree-eh.c	(revision 141162)
+++ tree-eh.c	(local)
@@ -310,6 +310,10 @@ collect_finally_tree (gimple stmt, gimpl
       collect_finally_tree_1 (gimple_eh_filter_failure (stmt), region);
       break;
 
+    case GIMPLE_TM_ATOMIC:
+      collect_finally_tree_1 (gimple_seq_body (stmt), region);
+      break;
+
     default:
       /* A type, a decl, or some kind of statement that we're not
 	 interested in.  Don't walk them.  */
@@ -1869,6 +1873,21 @@ lower_eh_constructs_2 (struct leh_state 
       /* Return since we don't want gsi_next () */
       return;
 
+    case GIMPLE_TM_ATOMIC:
+      {
+        /* Record the transaction region in the EH tree, then process
+	   the body of the transaction.  We don't lower the transaction
+	   node just yet.  */
+	struct eh_region *outer = state->cur_region;
+	state->cur_region = gen_eh_region_transaction (outer);
+
+	record_stmt_eh_region (state->cur_region, stmt);
+	lower_eh_constructs_1 (state, gimple_seq_body (stmt));
+
+	state->cur_region = outer;
+      }
+      break;
+
     default:
       /* A type, a decl, or some kind of statement that we're not
 	 interested in.  Don't walk them.  */
@@ -2043,6 +2062,9 @@ verify_eh_edges (gimple stmt)
 	}
       if (!stmt_could_throw_p (stmt))
 	{
+	  /* ??? Add bits to verify transactional edges.  */
+	  if (is_transactional_stmt (stmt))
+	    return false;
 	  error ("BB %i last statement has incorrectly set region", bb->index);
 	  return true;
 	}
--- tree-flow.h	(revision 141162)
+++ tree-flow.h	(local)
@@ -1079,6 +1079,9 @@ extern int lookup_expr_eh_region (tree);
 extern int lookup_stmt_eh_region (gimple);
 extern bool verify_eh_edges (gimple);
 
+/* In gtm-low.c  */
+extern void make_tm_edge (gimple);
+extern bool is_transactional_stmt (const_gimple);
 
 /* In tree-ssa-pre.c  */
 struct pre_expr_d;
--- tree-pass.h	(revision 141162)
+++ tree-pass.h	(local)
@@ -388,7 +388,7 @@ extern struct gimple_opt_pass pass_reass
 extern struct gimple_opt_pass pass_rebuild_cgraph_edges;
 extern struct gimple_opt_pass pass_build_cgraph_edges;
 extern struct gimple_opt_pass pass_reset_cc_flags;
-extern struct gimple_opt_pass pass_expand_tm;
+extern struct gimple_opt_pass pass_lower_tm;
 extern struct gimple_opt_pass pass_checkpoint_tm;
 
 /* IPA Passes */

* Re: [trans-mem] rewrite transaction lowering
  2008-10-18  2:02 [trans-mem] rewrite transaction lowering Richard Henderson
@ 2008-10-24  0:12 ` Albert Cohen
  2008-10-24  8:50   ` Jakub Jelinek
From: Albert Cohen @ 2008-10-24  0:12 UTC
  To: Richard Henderson; +Cc: gcc-patches

Richard Henderson wrote:
> The lowering of transactions is now much more to my liking.  It's a 
> 5-step process, described in detail in a block comment at the beginning 
> of gtm-low.c.  The important functional changes are (1) we properly 
> commit the transaction when we leave the region via an exception, and 
> (2) we accurately describe the CFG and the abnormal edges created by 
> the longjmp used to restart or abort the transaction.

Thanks a lot, Richard, for getting into the GTM code so quickly. 
Martin and I are thrilled to see that some of our long-standing issues 
have been solved so nicely.

> Still to do: rtl expansion of the TM_LOAD/STORE nodes, rewriting 
> creation of the transactional clones as a real IPA pass, and 
> duplicating the transaction region without annotations as a fast path 
> when the transaction gets (re-)started in sequential mode.

Indeed.

We also expect a significant performance boost from moving to a real 
IPA pass.  Having to mark the read/write barriers with the proper VUSE 
and VDEF operands is known to seriously inhibit optimizations that 
would normally apply to the non-instrumented variant of a function (or 
outside a tm_atomic block).
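
For instance (the barrier name below is hypothetical), once a load
goes through an opaque barrier call, it can no longer be hoisted:

   extern int __tm_load_i32 (int *);   /* hypothetical barrier */
   int global;

   int sum (int n)
   {
     int s = 0, i;
     for (i = 0; i < n; ++i)
       s += __tm_load_i32 (&global);   /* the call clobbers memory, so
                                          the load of global cannot be
                                          hoisted out of the loop */
     return s;
   }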

We also expect many redundancies to disappear once the instrumentation 
(not only the checkpointing) operates on SSA form, where PRE can remove 
redundant load/store instructions.

It may also help a lot to remove barriers that are redundant with 
locks already acquired by earlier accesses (since a write lock 
subsumes a read lock, many such redundancies occur; see Tabatabai 
et al.'s CGO'07 paper).
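
A typical case, with the same hypothetical barrier names: once a
tm_store has acquired the write lock on a location, a following
tm_load of it is redundant:

   extern void __tm_store_i32 (int *, int);   /* hypothetical barriers */
   extern int __tm_load_i32 (int *);

   void g (int *x)
   {
     int t;
     __tm_store_i32 (x, 1);     /* acquires the write lock on *x */
     t = __tm_load_i32 (x);     /* redundant: the write lock subsumes
                                   the read lock, so this could become
                                   a plain load */
     (void) t;
   }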

We have seen these problems occurring in real-world parallelization 
experiments, when extending the par-loops pass with automatic TM 
insertion for generalized reduction support. Andrea Marongiu from U. 
Bologna is investigating this issue, and any feedback, ideas, or 
benchmarks motivating this approach would be very welcome.

> What I generate is no longer compatible with TinySTM.  It's closer to 
> what I believe the final ABI should look like.  At some point I'll 
> either import some existing GPL STM library (TinySTM, unless someone has 
> a better suggestion) or start a new one from scratch.  We'll see...

I just noticed one stupid thing that reminds me of a license problem 
we had with Graphite... TinySTM is GPLv2, not GPLv2+ :-( I guess the 
problem could be trivially solved by updating the license in this case, 
as the copyright holders are well identified. Yet TinySTM is not a very 
big piece of code, and it would indeed make sense to derive a 
GCC-specific STM from it. I don't see any urgency here, however.

Also, one emphasis of our GTM project is to help TM research (Martin 
Schindewolf is funded by the HiPEAC European research network,
http://www.hipeac.net). This means being able to retarget GTM to various 
TM runtimes, including hybrid ones (e.g., for Sun's HW, or for 
simulators). This emphasis should not inhibit a quick path to 
production-level support for TM in GCC, but interchangeability of the 
runtime is clearly important, as the existing runtimes are far from 
mature and many improvements could arise from third-party research and 
development.

Albert

* Re: [trans-mem] rewrite transaction lowering
  2008-10-24  0:12 ` Albert Cohen
@ 2008-10-24  8:50   ` Jakub Jelinek
From: Jakub Jelinek @ 2008-10-24  8:50 UTC
  To: Albert Cohen; +Cc: Richard Henderson, gcc-patches

On Fri, Oct 24, 2008 at 12:07:07AM +0200, Albert Cohen wrote:
> It may also help a lot to remove barriers that are redundant with 
> locks already acquired by earlier accesses (since a write lock 
> subsumes a read lock, many such redundancies occur; see Tabatabai 
> et al.'s CGO'07 paper).

The question is how tightly we want to couple GCC to the underlying TM
library (and possibly to different configurations and/or runtime modes
thereof).  E.g. for a read after a read, or a read after a write, if we
know the implementation is in-place-update, we can just use normal
reads, but if the implementation can be either in-place-update or
write-buffering, we can't do that.
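
Concretely (all barrier names below are made up):

   extern void __tm_store_i32 (int *, int);
   extern int __tm_load_i32 (int *);
   int x;

   void f (void)
   {
     int a, b;
     __tm_store_i32 (&x, 42);
     a = x;                     /* OK only when the runtime is known to
                                   be in-place-update: memory already
                                   holds 42 */
     b = __tm_load_i32 (&x);    /* always OK: a write-buffering runtime
                                   finds 42 in its redo log */
   }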

> Also, one emphasis of our GTM project is to help TM research (Martin  
> Schindewolf is funded by the HiPEAC european research network,
> http://www.hipeac.net). This means being able to retarget GTM to various  
> TM runtimes, including hybrid ones (e.g., for Sun's HW, or for  
> simulators). This emphasis should not inhibit a quickpath to a  
> production-level support for TM in GCC, but interchangeability of the  
> runtime is clearly important, as the existing runtimes are far from  
> mature and many improvements could arise from third-party research and  
> developments.

Here the question is whether we want the interchange to be done at the
compiler level or the library level.  Obviously for hybrid HW + SW
approaches the HW support needs to be in the compiler (just add the
right insn patterns in the target description and/or target
hooks/builtins).  For SW, if we use a sufficiently rich ABI (ideally a
multi-vendor one), an STM library shipped with GCC would of course
support that ABI, and for other TM libraries you could just write a
small interface library that translates the ABI calls to your TM
library's ABI.
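
Such an interface library could be as thin as this sketch (every name
in it is made up):

   #include <stdint.h>

   /* Entry points of the vendor's TM runtime.  */
   extern uint32_t vendor_tx_begin (void);
   extern void vendor_tx_commit (void);

   /* Compiler-side ABI entry points, forwarded to the vendor
      runtime.  */
   uint32_t __gtm_begin (void) { return vendor_tx_begin (); }
   void __gtm_commit (void) { vendor_tx_commit (); }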

	Jakub
