public inbox for gcc-patches@gcc.gnu.org
* [PATCH] _Cilk_for for C and C++
@ 2013-11-15 21:45 Iyer, Balaji V
  2013-11-16  1:38 ` Aldy Hernandez
                   ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2013-11-15 21:45 UTC
  To: gcc-patches, Jeff Law, Jason Merrill (jason@redhat.com)
  Cc: Aldy Hernandez (aldyh@redhat.com), rth

[-- Attachment #1: Type: text/plain, Size: 10044 bytes --]

[Sorry if you received multiple copies; I forgot to CC gcc-patches on the first one.]

Hello Everyone,
	Attached, please find two patches implementing _Cilk_for for C and C++. I am attaching both in the same email because they share many common routines, and C++ requires certain quick fix-ups that C doesn't need and vice versa. Seeing both patches together makes these things easier to follow (at least it does for me).

This patch is dependent on the following patches:

#pragma simd work (they both share the same parser routines)

_Cilk_spawn and _Cilk_sync for C++: The runtime that supports _Cilk_for requires a C++ compiler with _Cilk_spawn and _Cilk_sync support, and the runtime is required to execute the _Cilk_for tests: http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01807.html. The C part is approved, but the C++ work is still under review.


One small thing that Jakub and several others have asked for before, and that I have not done, is adding tests for _Cilk_for to c-c++-common. The reason is that the syntax differs between the C and C++ implementations. In C++, the induction variable must be declared in the initializer (e.g. the loop should start with _Cilk_for (int ii = 0....)). In C, this is not allowed (e.g. the loop should start as _Cilk_for (ii = 0; ii < 10; ii++)).
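
As a rough illustration of the C form described above (a sketch only, not taken from the attached tests): fill_squares is a hypothetical helper, and the __cilk check is an assumption about how a Cilk Plus-enabled compiler identifies itself. Without it the loop falls back to a plain serial for, so the snippet still builds elsewhere.

```c
/* Assumption: a compiler with Cilk Plus enabled defines __cilk.  If it
   does not, treat _Cilk_for as an ordinary serial for loop.  */
#ifndef __cilk
#define _Cilk_for for
#endif

/* Hypothetical helper, not part of the patch: store ii * ii in out[ii]
   for every ii in [0, n).  */
void
fill_squares (int *out, int n)
{
  /* C form: the induction variable is declared outside the loop header.  */
  int ii;

  /* Each iteration writes only its own element, so the iterations are
     independent of one another and may be run in parallel.  */
  _Cilk_for (ii = 0; ii < n; ii++)
    out[ii] = ii * ii;

  /* The C++ form would instead start with:
       _Cilk_for (int ii = 0; ii < n; ii++)  */
}
```

Calling fill_squares (out, 5) on a five-element int array leaves 0, 1, 4, 9, 16 in it, whether the loop runs serially or in parallel. Because the two loop headers differ, a single source file cannot satisfy both front ends as described here, which is why the tests live in gcc.dg and g++.dg.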

Here are the ChangeLog entries:

gcc/ChangeLog.
2013-11-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* cilk-builtins.def: Added 2 builtin functions: __cilkrts_cilk_for_64
	and __cilkrts_cilk_for_32.
	* cilk-common.c (cilk_declare_looper): New function.
	(cilk_init_builtins): Added two calls to cilk_declare_looper.
	* cilk.h (enum cilk_tree_index): Added two enums: CILK_TI_F_LOOP_32
	and CILK_TI_F_LOOP_64.
	(enum add_variable_type): Moved here from c-family/cilk.c
	(enum cilk_block_type): Likewise.
	(struct wrapper_data): Likewise.
	(struct cilk_for_desc): New struct.
	(cilk_for_32_fndecl): New #define.
	(cilk_for_64_fndecl): Likewise.
	* tree.h (CILK_FOR_INIT): Likewise.
	(CILK_FOR_COND): Likewise.
	(CILK_FOR_EXPR): Likewise.
	(CILK_FOR_BODY): Likewise.
	(CILK_FOR_SCOPE): Likewise.
	(CILK_FOR_GRAIN): Likewise.
	(CILK_FOR_VAR): Likewise.
	* gimplify.c (gimplify_expr): Added CILK_FOR_STMT case.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* langhooks-def.h (LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR): New
	#define.
	(LANG_HOOKS_CILKPLUS): Added LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR
	field.
	* langhooks.h (struct lang_hooks_for_cilkplus): Added a new field
	gimplify_cilk_for.
	* tree.def: Added a new tree CILK_FOR_STMT.

gcc/c-family/ChangeLog.
2013-11-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-cilkplus.c (c_check_cilk_loop_incr): Added a check for INDIRECT_REF,
	CLEANUP_POINT_EXPR and TARGET_EXPR.  Added PLUS_EXPR and
	POINTER_PLUS_EXPR cases for MODIFY_EXPR case.  Added a CALL_EXPR case.
	(c_validate_cilk_plus_loop): Added "_Cilk_for loops" in the error
	reporting string.
	(c_check_cilk_loop): Added 2 new boolean parameters.  Added handling
	of CALL_EXPR and removed unwanted wrappers from the condition params.
	(c_finish_cilk_for_loop): New function.
	(cp_finish_cilk_for_loop): Likewise.
	* c-common.c (c_common_resword): Added _Cilk_for keyword.
	* c-common.h (enum rid): Added RID_CILK_FOR.
	(cp_finish_cilk_for_loop): New prototype.
	(c_finish_cilk_for_loop): Likewise.
	(cilk_init_cfd): Likewise.
	(cilk_extract_free_variables): Likewise.
	(cilk_create_cilk_helper_decl): Likewise.
	(cilk_call_graph_add_fn): Likewise.
	(cilk_outline_body): Likewise.
	(cilk_check_loop_difference_type): Likewise.
	(declare_cilk_for_parms): Likewise.
	(declare_cilk_for_vars): Likewise.
	(cilk_loop_convert): Likewise.
	(cilk_divide_count): Likewise.
	(cilk_calc_forward_div_op): Likewise.
	(cilk_compute_loop_count): Likewise.
	(insert_cilk_for_nested_fn): Likewise.
	(cilk_compute_loop_var): Likewise.
	(cilk_set_inclusive_and_direction): Likewise.
	(cilk_set_iter_difftype): Likewise.
	(cilk_set_incr_info): Likewise.
	(cilk_set_init_info): Likewise.
	(cilk_simplify_tree): Likewise.
	(cilk_find_code_from_call): Likewise.
	(cilk_tree_operand_noconv): Likewise.
	(cilk_resolve_continue_stmts): Likewise.
	* c-pragma.c (init_pragma): Added pragma grainsize.
	* c-pragma.h (enum pragma_kind): Added PRAGMA_CILK_GRAINSIZE.
	* cilk.c (enum add_variable_type): Moved to ../cilk.h.
	(enum cilk_block_type): Likewise.
	(struct wrapper_data): Likewise.
	(cilk_call_graph_add_fn): New function.
	(cilk_create_cilk_helper_decl): Likewise.
	(cilk_outline): Renamed to cilk_outline_body.  Also added a parameter
	to hold throw flag for C++.
	(cilk_create_wrapper_body): Renamed create_cilk_helper_decl,
	call_graph_add_fn and cilk_outline to cilk_create_cilk_helper_decl,
	cilk_call_graph_add_fn, and cilk_outline_body, respectively.
	(create_cilk_wrapper): Renamed extract_free_variables to
	cilk_extract_free_variables.
	(extract_free_variables): Likewise.
	(cilk_init_cfd): New function.
	(find_cilk_for_library_fn): Likewise.
	(cilk_compute_incr_direction): Likewise.
	(cilk_check_loop_difference_type): Likewise.
	(cilk_simplify_tree): Likewise.
	(declare_cilk_for_vars): Likewise.
	(declare_cilk_for_parms): Likewise.
	(cilk_loop_convert): Likewise.
	(cilk_divide_count): Likewise.
	(cilk_calc_forward_div_op): Likewise.
	(cilk_compute_loop_count): Likewise.
	(insert_cilk_for_nested_fn): Likewise.
	(cilk_compute_loop_var): Likewise.
	(cilk_tree_operand_noconv): Likewise.
	(cilk_find_code_from_call): Likewise.
	(cilk_set_init_info): Likewise.
	(cilk_set_inclusive_and_direction): Likewise.
	(cilk_set_iter_difftype): Likewise.
	(cilk_set_incr_info): Likewise.

gcc/c/ChangeLog.
2013-11-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* Make-lang.in (C_AND_OBJC_OBJS): Added c/c-cilk.o.
	* c-cilk.c: New file.
	* c-objc-common.h (LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR): New
	#define.
	* c-parser.c (c_parser_cilk_for_statement): New function prototype.
	(c_parser_cilk_grainsize): New function prototype and function.
	(c_parser_statement_after_labels): Added RID_CILK_FOR case.
	(c_parser_pragma): Added PRAGMA_CILK_GRAINSIZE case.
	(c_parser_cilk_for_statement): Renamed a parameter.  Added code to
	accommodate RID_CILK_FOR tree (i.e. to parse _Cilk_for statements).
	* c-tree.h (c_gimplify_cilk_for): New prototype.

gcc/testsuite/ChangeLog.
2013-11-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* gcc.dg/cilk-plus/CK/cilk-for.c: New test.
	* gcc.dg/cilk-plus/CK/cilk_for_decr.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_errors.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_grain.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_warning.c: Likewise.
	* gcc.dg/cilk-plus/cilk-plus.exp: Added support to run the _Cilk_for
	tests.

gcc/cp/ChangeLog.
2013-11-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* cp-cilk.c: Added langhooks.h and tree.h.
	(callable): New function.
	(calc_count_up_count_down): Likewise.
	(compute_loop_var_cp_iter_hdl): Likewise.
	(cp_create_cilk_for_body): Likewise.
	(create_cilk_for_nested_fn): Likewise.
	(gimplify_cilk_for_1): Likewise.
	(cp_extract_cilk_for_fields): Likewise.
	(cp_gimplify_cilk_for): Likewise.
	* cp-gimplify.c (genericize_cilk_for_stmt): Likewise.
	(cp_genericize_r): Added a check for CILK_FOR_STMT.
	* cp-objcp-common.h (LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR): New
	#define.
	* cp-tree.h (begin_cilk_for_stmt): New prototype.
	(finish_cilk_for_stmt): Likewise.
	(finish_cilk_for_init_stmt): Likewise.
	(cp_gimplify_cilk_for): Likewise.
	* name-lookup.c (begin_scope): Added sk_cilk_for case.
	* name-lookup.h (enum scope_kind): Added sk_cilk_for.
	* parser.c (cp_parser_cilk_grainsize): New function and prototype.
	(cp_parser_init_declarator): Added a new parameter to hold the
	initial value.
	(cp_parser_statement): Added RID_CILK_FOR case.
	(cp_parser_iteration_statement): Likewise.
	(cp_parser_jump_statement): Added IN_CILK_FOR_STMT case (twice).
	(cp_parser_pragma): Added PRAGMA_CILK_GRAINSIZE case.
	(cp_parser_cilk_for_init_statement): New function.
	(cp_parser_cilk_for): Renamed a parameter and added support for
	parsing _Cilk_for loops that are part of Cilk keywords.
	* parser.h (IN_CILK_FOR_STMT): New #define.
	* pt.c (tsubst_expr): Added CILK_FOR_STMT case.
	* semantics.c (begin_for_scope): Added "_Cilk_for statement" in the
	header comment.
	(finish_for_expr): Added support for CILK_FOR_STMT to use this
	function.
	(finish_cilk_for_cond): Added support for processing templates.
	(begin_cilk_for_stmt): New function.
	(finish_cilk_for_init_stmt): Likewise.
	(finish_cilk_for_stmt): Likewise.

gcc/testsuite/ChangeLog.
2013-11-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc: New test.
	* g++.dg/cilk-plus/CK/cilk-for-tplt.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk-for.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_grainsize.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_p_errors.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_t_errors.cc: Likewise.
	* g++.dg/cilk-plus/CK/explicit_ctor.cc: Likewise.
	* g++.dg/cilk-plus/CK/label_test.cc: Likewise.
	* g++.dg/cilk-plus/CK/no-opp-overload-error.cc: Likewise.
	* g++.dg/cilk-plus/CK/plus-equal-one.cc: Likewise.
	* g++.dg/cilk-plus/CK/plus-equal-test.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_test.cc: Likewise.
	* g++.dg/cilk-plus/cilk-plus.exp: Added support to call _Cilk_for
	testcodes.


Thanking You,

Yours Sincerely,

Balaji V. Iyer.

[-- Attachment #2: diff_cilk_for_c.txt --]
[-- Type: text/plain, Size: 100475 bytes --]

diff --git a/gcc/c-family/c-cilkplus.c b/gcc/c-family/c-cilkplus.c
index cfaa10e..2c4233b 100644
--- a/gcc/c-family/c-cilkplus.c
+++ b/gcc/c-family/c-cilkplus.c
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 static tree
 c_check_cilk_loop_incr (location_t loc, tree decl, tree incr)
 {
+  tree orig_incr = incr;
   if (!incr)
     {
       error_at (loc, "missing increment");
@@ -50,6 +51,16 @@ c_check_cilk_loop_incr (location_t loc, tree decl, tree incr)
   if (EXPR_HAS_LOCATION (incr))
     loc = EXPR_LOCATION (incr);
 
+  /* We hit this if-statement if we have an overloaded operator like
+     this: *my_class::operator+= (&ii, 1).  For example, see the testcase
+     plus-equal-test.cc.  */
+  if (TREE_CODE (incr) == INDIRECT_REF
+      || TREE_CODE (incr) == CLEANUP_POINT_EXPR)
+    incr = TREE_OPERAND (incr, 0);
+
+  if (TREE_CODE (incr) == TARGET_EXPR)
+    incr = TARGET_EXPR_INITIAL (incr);
+
   switch (TREE_CODE (incr))
     {
     case POSTINCREMENT_EXPR:
@@ -69,10 +80,12 @@ c_check_cilk_loop_incr (location_t loc, tree decl, tree incr)
 	  break;
 
 	rhs = TREE_OPERAND (incr, 1);
-	if (TREE_CODE (rhs) == PLUS_EXPR
+	if ((TREE_CODE (rhs) == PLUS_EXPR
+	     || TREE_CODE (rhs) == POINTER_PLUS_EXPR)
 	    && (TREE_OPERAND (rhs, 0) == decl
 		|| TREE_OPERAND (rhs, 1) == decl)
-	    && INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	    && (INTEGRAL_TYPE_P (TREE_TYPE (rhs))
+		|| POINTER_TYPE_P (TREE_TYPE (rhs))))
 	  return incr;
 	else if (TREE_CODE (rhs) == MINUS_EXPR
 		 && TREE_OPERAND (rhs, 0) == decl
@@ -83,6 +96,49 @@ c_check_cilk_loop_incr (location_t loc, tree decl, tree incr)
 	break;
       }
 
+      /* We encounter CALL_EXPR in C++ when we have a case like this:
+	 operator+= (&ii, 1);  */
+    case CALL_EXPR:
+      {
+	enum tree_code code = cilk_find_code_from_call (CALL_EXPR_FN (incr));
+	if (code == POSTINCREMENT_EXPR || code == POSTDECREMENT_EXPR
+	    || code == PREINCREMENT_EXPR || code == PREDECREMENT_EXPR)
+	  {
+	    tree val = CALL_EXPR_ARG (incr, 0);
+	    if (TREE_CODE (val) == ADDR_EXPR
+		|| TREE_CODE (val) == INDIRECT_REF)
+	      val = TREE_OPERAND (val, 0);
+	    if (val != decl)
+	      break;
+	    return c_omp_for_incr_canonicalize_ptr (loc, decl, incr);
+	  }
+	for (int ii = 0; ii < call_expr_nargs (incr); ii++)
+	  {
+	    tree val = CALL_EXPR_ARG (incr, ii);
+	    if (TREE_CODE (val) == ADDR_EXPR)
+	      val = TREE_OPERAND (val, 0);
+	    if (val == decl)
+	      continue;
+	    else
+	      {
+		tree rhs = val;
+		if (TREE_CODE (rhs) == INTEGER_CST)
+		  return orig_incr;
+		if ((TREE_CODE (rhs) == PLUS_EXPR
+		     || TREE_CODE (rhs) == POINTER_PLUS_EXPR)
+		    && (TREE_OPERAND (rhs, 0) == decl
+			|| TREE_OPERAND (rhs, 1) == decl)
+		    && (INTEGRAL_TYPE_P (TREE_TYPE (rhs))
+			|| POINTER_TYPE_P (TREE_TYPE (rhs))))
+		  return orig_incr;
+		else if (TREE_CODE (rhs) == MINUS_EXPR
+			 && TREE_OPERAND (rhs, 0) == decl
+			 && INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+		  return orig_incr;
+	      }
+	  }
+      }
+
     default:
       break;
     }
@@ -121,7 +177,7 @@ c_validate_cilk_plus_loop (tree *tp, int *walk_subtrees, void *data)
 	      {
 		error_at (EXPR_LOCATION (*tp),
 			  "calls to setjmp are not allowed within loops "
-			  "annotated with #pragma simd");
+			  "annotated with #pragma simd or _Cilk_for loops");
 		*valid = false;
 		*walk_subtrees = 0;
 	      }
@@ -185,7 +241,8 @@ c_check_cilk_loop_body (tree body)
 
 static bool
 c_check_cilk_loop (location_t loc, tree decl, tree cond, tree *incrp,
-		   tree body, bool scan_body)
+		   tree body, bool scan_body, bool is_cpp,
+		   bool proc_templates_p)
 {
   tree incr = *incrp;
 
@@ -220,9 +277,22 @@ c_check_cilk_loop (location_t loc, tree decl, tree cond, tree *incrp,
   if (!INTEGRAL_TYPE_P (TREE_TYPE (decl))
       && !POINTER_TYPE_P (TREE_TYPE (decl)))
     {
-      error_at (loc, "induction variable must be of integral "
-		"or pointer type (have %qT)", TREE_TYPE (decl));
-      return false;
+      /* In C++ iterators are allowed.  */
+      if (!is_cpp)
+	{
+	  error_at (loc, "induction variable must be of integral "
+		    "or pointer type (have %qT)", TREE_TYPE (decl));
+	  return false;
+	}
+      /* If we are processing templates then these checks are done
+	 in pt.c.  */
+      else if (!proc_templates_p
+	       && TREE_CODE (TREE_TYPE (decl)) != RECORD_TYPE)
+	{
+	  error_at (loc, "induction variable must be of integral "
+		    "record or pointer type (have %qT)", TREE_TYPE (decl)); 
+	  return false;
+	}
     }
 
   /* Validate the condition.  */
@@ -233,6 +303,7 @@ c_check_cilk_loop (location_t loc, tree decl, tree cond, tree *incrp,
     }
   bool cond_ok = false;
   if (TREE_CODE (cond) == NE_EXPR
+      || TREE_CODE (cond) == CALL_EXPR
       || TREE_CODE (cond) == LT_EXPR
       || TREE_CODE (cond) == LE_EXPR
       || TREE_CODE (cond) == GT_EXPR
@@ -242,9 +313,9 @@ c_check_cilk_loop (location_t loc, tree decl, tree cond, tree *incrp,
 	   DECL <comparison_operator> EXPR
 	   EXPR <comparison_operator> DECL
       */
-      if (decl == TREE_OPERAND (cond, 0))
+      if (decl == cilk_simplify_tree (TREE_OPERAND (cond, 0)))
 	cond_ok = true;
-      else if (decl == TREE_OPERAND (cond, 1))
+      else if (decl == cilk_simplify_tree (TREE_OPERAND (cond, 1)))
 	{
 	  /* Canonicalize the comparison so the DECL is on the LHS.  */
 	  TREE_SET_CODE (cond,
@@ -254,6 +325,29 @@ c_check_cilk_loop (location_t loc, tree decl, tree cond, tree *incrp,
 	  cond_ok = true;
 	}
     }
+
+  /* In C++ you can have cases like this: x < 5
+     where '<' is overloaded and so it is translated like this:
+     operator< (x, 5), and this is acceptable.  */
+  cond = cilk_simplify_tree (cond);
+  if (!cond_ok && is_cpp && TREE_CODE (cond) == CALL_EXPR)
+    {
+      if (call_expr_nargs (cond) < 2)
+	cond_ok = false;
+      for (int ii = 0; ii < call_expr_nargs (cond); ii++)
+	{
+	  tree val = cilk_simplify_tree (CALL_EXPR_ARG (cond, ii));
+	  if (TREE_CODE (val) == ADDR_EXPR)
+	    val = TREE_OPERAND (val, 0);
+	  else if (TREE_CODE (val) == TARGET_EXPR)
+	    val = TARGET_EXPR_INITIAL (val);
+	  if (decl == val)
+	    {
+	      cond_ok = true;
+	      break;
+	    }
+	}
+    }	
   if (!cond_ok)
     {
       error_at (loc, "invalid controlling predicate");
@@ -305,7 +399,8 @@ c_finish_cilk_simd_loop (location_t loc,
 {
   location_t rhs_loc;
 
-  if (!c_check_cilk_loop (loc, decl, cond, &incr, body, scan_body))
+  if (!c_check_cilk_loop (loc, decl, cond, &incr, body, scan_body, false,
+			  false))
     return NULL;
 
   /* In the case of "for (int i = 0...)", init will be a decl.  It should
@@ -352,6 +447,76 @@ c_finish_cilk_simd_loop (location_t loc,
   return add_stmt (t);
 }
 
+/* Validate and emit code for the _Cilk_for loop.
+
+   LOC is the location of the _Cilk_for.
+   DECL is the iteration variable.
+   INIT is the initialization expression.
+   COND is the controlling predicate.
+   INCR is the increment expression.
+   BODY is the body of the loop.
+   GRAIN is the grainsize that is stored in CILK_FOR_GRAIN.
+   IS_CPP and PROC_TEMPLATES_P are passed on to c_check_cilk_loop.
+   Returns the generated statement.  */
+
+tree
+c_finish_cilk_for_loop (location_t loc, tree decl, tree init, tree cond,
+			tree incr, tree body, tree grain, bool is_cpp,
+			bool proc_templates_p)
+{
+  if (!c_check_cilk_loop (loc, decl, cond, &incr, body, true,
+			  is_cpp, proc_templates_p))
+    return NULL;
+
+  /* In the case for "_Cilk_for (int i = 0...)", init will be a decl.  It
+     should have a DECL_INITIAL that we can turn into an assignment.  */
+  if (init == decl)
+    {
+      location_t rhs_loc = DECL_SOURCE_LOCATION (decl);
+      init = DECL_INITIAL (decl);
+      if (!init)
+	{
+	  error_at (rhs_loc, "%qE is not initialized", decl);
+	  init = integer_zero_node;
+	  return NULL;
+	}
+      init = build2 (INIT_EXPR, TREE_TYPE (decl), decl, init);
+      DECL_INITIAL (decl) = NULL;
+    }
+
+  tree t = make_node (CILK_FOR_STMT);
+  TREE_TYPE (t) = void_type_node;
+
+  init = build2 (INIT_EXPR, TREE_TYPE (decl), decl, init);
+  CILK_FOR_INIT (t) = init;
+  CILK_FOR_COND (t) = cond;
+  CILK_FOR_EXPR (t) = incr;
+  CILK_FOR_BODY (t) = body;
+  CILK_FOR_SCOPE (t) = NULL_TREE;
+  CILK_FOR_VAR (t) = decl;
+  CILK_FOR_GRAIN (t) = grain;
+
+  SET_EXPR_LOCATION (t, loc);
+  return add_stmt (t);
+}  
+
+/* Similar to c_finish_cilk_for_loop, but don't actually create the
+   CILK_FOR_STMT tree and return it.  *CILK_FOR_STMT is the CILK_FOR_STMT
+   tree and PROC_TEMPLATES_P is set if we are processing templates.  */
+
+void
+cp_finish_cilk_for_loop (tree *cilk_for_stmt, bool proc_templates_p)
+{
+  tree cfor = *cilk_for_stmt;
+  tree incr = CILK_FOR_EXPR (cfor);
+  if (!c_check_cilk_loop (EXPR_LOCATION (cfor),	CILK_FOR_VAR (cfor),
+			  CILK_FOR_COND (cfor), &incr, CILK_FOR_BODY (cfor), 
+			  true, true, proc_templates_p))
+    *cilk_for_stmt = error_mark_node;
+  else
+    CILK_FOR_EXPR (*cilk_for_stmt) = incr;
+}
+
 /* Validate and emit code for <#pragma simd> clauses.  */
 
 tree
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 1f5e4ed..4d90633 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -415,6 +415,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 2599c97..bb4ab79
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
@@ -527,7 +527,10 @@ struct GTY(()) c_language_function {
 /* In c-cilkplus.c */
 extern tree c_finish_cilk_simd_loop (location_t, tree, tree, tree, tree,
 				     tree, tree, tree, bool);
+extern void cp_finish_cilk_for_loop (tree *, bool);
 extern tree c_finish_cilk_clauses (tree);
+extern tree c_finish_cilk_for_loop (location_t, tree, tree, tree, tree, tree,
+				    tree, bool, bool);
 
 /* Language-specific hooks.  */
 
@@ -1296,8 +1299,6 @@ extern enum stv_conv scalar_to_vector (location_t loc, enum tree_code code,
 				       tree op0, tree op1, bool);
 
 /* In c-cilkplus.c  */
-extern tree c_finish_cilk_simd_loop (location_t, tree, tree, tree, tree,
-				     tree, tree, bool);
 extern tree c_finish_cilk_clauses (tree);
 extern tree c_validate_cilk_plus_loop (tree *, int *, void *);
 
@@ -1384,5 +1385,31 @@ extern tree build_cilk_spawn (location_t, tree);
 extern tree make_cilk_frame (tree);
 extern tree create_cilk_function_exit (tree, bool, bool);
 extern tree cilk_install_body_pedigree_operations (tree);
-
+extern void cilk_init_cfd (struct cilk_for_desc *);
+extern void cilk_extract_free_variables (tree, struct wrapper_data *, int);
+extern tree cilk_create_cilk_helper_decl (struct wrapper_data *);
+extern void cilk_call_graph_add_fn (tree);
+extern void cilk_outline_body (tree, tree *, struct wrapper_data *, bool *);
+extern tree cilk_check_loop_difference_type (tree);
+extern void declare_cilk_for_parms (struct cilk_for_desc *);
+extern void declare_cilk_for_vars (struct cilk_for_desc *, tree);
+extern tree cilk_loop_convert (tree, tree);
+extern tree cilk_divide_count (tree, enum tree_code, tree, bool, tree);
+extern void cilk_calc_forward_div_op (struct cilk_for_desc *, enum tree_code *,
+				      tree *);
+extern tree cilk_compute_loop_count (struct cilk_for_desc *, enum tree_code,
+				     tree, tree, tree);
+extern gimple_seq insert_cilk_for_nested_fn (struct cilk_for_desc *, tree,
+					     tree);
+extern tree cilk_compute_loop_var (struct cilk_for_desc *, tree, tree,
+				   tree (*)(location_t, enum tree_code, tree,
+					    tree, tree));
+extern void cilk_set_inclusive_and_direction (struct cilk_for_desc *);
+extern void cilk_set_iter_difftype (struct cilk_for_desc *);
+extern void cilk_set_incr_info (struct cilk_for_desc *, bool);
+extern void cilk_set_init_info (struct cilk_for_desc *);
+extern tree cilk_simplify_tree (tree);
+extern enum tree_code cilk_find_code_from_call (tree);
+extern tree cilk_tree_operand_noconv (tree);
+extern tree cilk_resolve_continue_stmts (tree *, int *, void *);
 #endif /* ! GCC_C_COMMON_H */
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index 179c620..351977e 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1392,6 +1392,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 5379b9e..ca8b190 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+  
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c-family/cilk.c b/gcc/c-family/cilk.c
index f6d7dce..38362f9
--- a/gcc/c-family/cilk.c
+++ b/gcc/c-family/cilk.c
@@ -34,47 +34,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic.h"
 #include "cilk.h"
 
-enum add_variable_type {
-    /* Reference to previously-defined variable.  */
-    ADD_READ,
-    /* Definition of a new variable in inner-scope.  */
-    ADD_BIND,
-    /* Write to possibly previously-defined variable.  */
-    ADD_WRITE
-};
-
-enum cilk_block_type {
-    /* Indicates a _Cilk_spawn block.  30 was an arbitary number picked for 
-       ease of debugging.  */
-    CILK_BLOCK_SPAWN = 30,
-    /* Indicates _Cilk_for statement block.  */
-    CILK_BLOCK_FOR
-};
-
-struct wrapper_data
-{
-  /* Kind of function to be created.  */
-  enum cilk_block_type type;
-  /* Signature of helper function.  */
-  tree fntype;
-  /* Containing function.  */
-  tree context;
-  /* Disposition of all variables in the inner statement.  */
-  struct pointer_map_t *decl_map;
-  /* True if this function needs a static chain.  */
-  bool nested;
-  /* Arguments to be passed to wrapper function, currently a list.  */
-  tree arglist;
-  /* Argument types, a list.  */
-  tree argtypes;
-  /* Incoming parameters.  */
-  tree parms;
-  /* Outer BLOCK object.  */
-  tree block;
-};
-
-static void extract_free_variables (tree, struct wrapper_data *,
-				    enum add_variable_type);
 static HOST_WIDE_INT cilk_wrapper_count;
 
 /* Marks the CALL_EXPR or FUNCTION_DECL, FCALL, as a spawned function call
@@ -155,8 +114,8 @@ pop_cfun_to (tree outer)
 /* This function does whatever is necessary to make the compiler emit a newly 
    generated function, FNDECL.  */
 
-static void
-call_graph_add_fn (tree fndecl)
+void
+cilk_call_graph_add_fn (tree fndecl)
 {
   const tree outer = current_function_decl;
   struct function *f = DECL_STRUCT_FUNCTION (fndecl);
@@ -282,8 +241,8 @@ cilk_detect_spawn_and_unwrap (tree *exp0)
 /* This function will build and return a FUNCTION_DECL using information 
    from *WD.  */
 
-static tree
-create_cilk_helper_decl (struct wrapper_data *wd)
+tree
+cilk_create_cilk_helper_decl (struct wrapper_data *wd)
 {
   char name[20];
   if (wd->type == CILK_BLOCK_FOR)
@@ -451,6 +410,8 @@ for_local_cb (const void *k_v, void **vp, void *p)
   tree k = *(tree *) &k_v;
   tree v = (tree) *vp;
 
+  if (k == v)
+    return true;
   if (v == error_mark_node)
     *vp = copy_decl_no_change (k, (copy_body_data *) p);
   return true;
@@ -471,15 +432,18 @@ wrapper_local_cb (const void *k_v, void **vp, void *data)
   return true;
 }
 
-/* Alter a tree STMT from OUTER_FN to form the body of INNER_FN.  */
+/* Alter a tree STMT from OUTER_FN to form the body of INNER_FN.  *THR is set
+   to true if the original function has exceptions enabled (only applicable
+   to the C++ _Cilk_for nested function).  This value is evaluated and then
+   passed back into cp_function_tree->can_throw.  */
 
-static void
-cilk_outline (tree inner_fn, tree *stmt_p, struct wrapper_data *wd)
+void
+cilk_outline_body (tree inner_fn, tree *stmt_p, struct wrapper_data *wd,
+		   bool *thr)
 {
   const tree outer_fn = wd->context;	      
   const bool nested = (wd->type == CILK_BLOCK_FOR);
   copy_body_data id;
-  bool throws;
 
   DECL_STATIC_CHAIN (outer_fn) = 1;
 
@@ -495,7 +459,7 @@ cilk_outline (tree inner_fn, tree *stmt_p, struct wrapper_data *wd)
   id.retvar = 0; 
   id.decl_map = wd->decl_map;
   id.copy_decl = nested ? copy_decl_no_change : copy_decl_for_cilk;
-  id.block = DECL_INITIAL (inner_fn);
+  id.block = 0; 
   id.transform_lang_insert_block = NULL;
 
   id.transform_new_cfg = true;
@@ -514,8 +478,10 @@ cilk_outline (tree inner_fn, tree *stmt_p, struct wrapper_data *wd)
   /* See if this function can throw or calls something that should
      not be spawned.  The exception part is only necessary if
      flag_exceptions && !flag_non_call_exceptions.  */
-  throws = false ;
+  bool throws = thr ? *thr : false;
   (void) walk_tree_without_duplicates (stmt_p, check_outlined_calls, &throws);
+  if (thr)
+    *thr = throws;
 }
 
 /* Generate the body of a wrapper function that assigns the
@@ -537,7 +503,7 @@ create_cilk_wrapper_body (tree stmt, struct wrapper_data *wd)
   /* Emit a function that takes WRAPPER_PARMS incoming and applies ARGS 
      (modified) to the wrapped function.  Return the wrapper and modified ARGS 
      to the caller to generate a function call.  */
-  fndecl = create_cilk_helper_decl (wd);
+  fndecl = cilk_create_cilk_helper_decl (wd);
   push_struct_function (fndecl);
   if (wd->nested && (wd->type == CILK_BLOCK_FOR))
     {
@@ -550,7 +516,7 @@ create_cilk_wrapper_body (tree stmt, struct wrapper_data *wd)
   for (p = wd->parms; p; p = TREE_CHAIN (p))
     DECL_CONTEXT (p) = fndecl;
 
-  cilk_outline (fndecl, &stmt, wd);
+  cilk_outline_body (fndecl, &stmt, wd, NULL);
   stmt = fold_build_cleanup_point_expr (void_type_node, stmt);
   gcc_assert (!DECL_SAVED_TREE (fndecl));
   lang_hooks.cilkplus.install_body_with_frame_cleanup (fndecl, stmt);
@@ -559,7 +525,7 @@ create_cilk_wrapper_body (tree stmt, struct wrapper_data *wd)
   pop_cfun_to (outer);
 
   /* Recognize the new function.  */
-  call_graph_add_fn (fndecl);
+  cilk_call_graph_add_fn (fndecl);
   return fndecl;
 }
 
@@ -701,14 +667,14 @@ create_cilk_wrapper (tree exp, tree *args_out)
      by spawn and the variable must remain in the outer function.  */
   if (TREE_CODE (exp) == INIT_EXPR)
     {
-      extract_free_variables (TREE_OPERAND (exp, 0), &wd, ADD_WRITE);
-      extract_free_variables (TREE_OPERAND (exp, 1), &wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (exp, 0), &wd, ADD_WRITE);
+      cilk_extract_free_variables (TREE_OPERAND (exp, 1), &wd, ADD_READ);
       /* TREE_TYPE should be void.  Be defensive.  */
       if (TREE_TYPE (exp) != void_type_node)
-	extract_free_variables (TREE_TYPE (exp), &wd, ADD_READ);
+	cilk_extract_free_variables (TREE_TYPE (exp), &wd, ADD_READ);
     }
   else
-    extract_free_variables (exp, &wd, ADD_READ);
+    cilk_extract_free_variables (exp, &wd, ADD_READ);
   pointer_map_traverse (wd.decl_map, declare_one_free_variable, &wd);
   wd.block = TREE_BLOCK (exp);
   if (!wd.block)
@@ -995,14 +961,14 @@ add_variable (struct wrapper_data *wd, tree var, enum add_variable_type how)
 
 /* Find the variables referenced in an expression T.  This does not avoid 
    duplicates because a variable may be read in one context and written in 
-   another.  HOW describes the context in which the reference is seen.  If 
+   another.  HOW_T describes the context in which the reference is seen.  If 
    NESTED is true a nested function is being generated and variables in the 
    original context should not be remapped.  */
 
-static void
-extract_free_variables (tree t, struct wrapper_data *wd,
-			enum add_variable_type how)
+void
+cilk_extract_free_variables (tree t, struct wrapper_data *wd, int how_t)
 {  
+  enum add_variable_type how = (enum add_variable_type) how_t;
   if (t == NULL_TREE)
     return;
 
@@ -1010,7 +976,7 @@ extract_free_variables (tree t, struct wrapper_data *wd,
   bool is_expr = IS_EXPR_CODE_CLASS (TREE_CODE_CLASS (code));
 
   if (is_expr)
-    extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
+    cilk_extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
 
   switch (code)
     {
@@ -1030,7 +996,7 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 
     case SSA_NAME:
       /* Currently we don't see SSA_NAME.  */
-      extract_free_variables (SSA_NAME_VAR (t), wd, how);
+      cilk_extract_free_variables (SSA_NAME_VAR (t), wd, how);
       return;
 
     case LABEL_DECL:
@@ -1052,12 +1018,12 @@ extract_free_variables (tree t, struct wrapper_data *wd,
     case NON_LVALUE_EXPR:
     case CONVERT_EXPR:
     case NOP_EXPR:
-      extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
       return;
 
     case INIT_EXPR:
-      extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_BIND);
-      extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_BIND);
+      cilk_extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
       return;
 
     case MODIFY_EXPR:
@@ -1066,8 +1032,8 @@ extract_free_variables (tree t, struct wrapper_data *wd,
     case POSTDECREMENT_EXPR:
     case POSTINCREMENT_EXPR:
       /* These write their result.  */
-      extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_WRITE);
-      extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_WRITE);
+      cilk_extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
       return;
 
     case ADDR_EXPR:
@@ -1078,9 +1044,9 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	 be addressable, and marking it modified will cause a spurious
 	 warning about writing the control variable.  */
       if (wd->type != CILK_BLOCK_SPAWN)
-	extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
       else 
-	extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_WRITE);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_WRITE);
       return;
 
     case ARRAY_REF:
@@ -1092,17 +1058,17 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	 is being accessed here.  As for ADDR_EXPR, don't do this
 	 in a nested loop, unless the access is to a fixed index.  */
       if (wd->type != CILK_BLOCK_FOR || TREE_CONSTANT (TREE_OPERAND (t, 1)))
-	extract_free_variables (TREE_OPERAND (t, 0), wd, how);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, how);
       else
-	extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
-      extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
-      extract_free_variables (TREE_OPERAND (t, 2), wd, ADD_READ);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 2), wd, ADD_READ);
       return;
 
     case TREE_LIST:
-      extract_free_variables (TREE_PURPOSE (t), wd, ADD_READ);
-      extract_free_variables (TREE_VALUE (t), wd, ADD_READ);
-      extract_free_variables (TREE_CHAIN (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_PURPOSE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_VALUE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_CHAIN (t), wd, ADD_READ);
       return;
 
     case TREE_VEC:
@@ -1110,7 +1076,7 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	int len = TREE_VEC_LENGTH (t);
 	int i;
 	for (i = 0; i < len; i++)
-	  extract_free_variables (TREE_VEC_ELT (t, i), wd, ADD_READ);
+	  cilk_extract_free_variables (TREE_VEC_ELT (t, i), wd, ADD_READ);
 	return;
       }
 
@@ -1118,13 +1084,13 @@ extract_free_variables (tree t, struct wrapper_data *wd,
       {
 	unsigned ii = 0;
 	for (ii = 0; ii < VECTOR_CST_NELTS (t); ii++)
-	  extract_free_variables (VECTOR_CST_ELT (t, ii), wd, ADD_READ); 
+	  cilk_extract_free_variables (VECTOR_CST_ELT (t, ii), wd, ADD_READ); 
 	break;
       }
 
     case COMPLEX_CST:
-      extract_free_variables (TREE_REALPART (t), wd, ADD_READ);
-      extract_free_variables (TREE_IMAGPART (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_REALPART (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_IMAGPART (t), wd, ADD_READ);
       return;
 
     case BIND_EXPR:
@@ -1135,11 +1101,11 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	    add_variable (wd, decl, ADD_BIND);
 	    /* A self-referential initialization is no problem because
 	       we already entered the variable into the map as local.  */
-	    extract_free_variables (DECL_INITIAL (decl), wd, ADD_READ);
-	    extract_free_variables (DECL_SIZE (decl), wd, ADD_READ);
-	    extract_free_variables (DECL_SIZE_UNIT (decl), wd, ADD_READ);
+	    cilk_extract_free_variables (DECL_INITIAL (decl), wd, ADD_READ);
+	    cilk_extract_free_variables (DECL_SIZE (decl), wd, ADD_READ);
+	    cilk_extract_free_variables (DECL_SIZE_UNIT (decl), wd, ADD_READ);
 	  }
-	extract_free_variables (BIND_EXPR_BODY (t), wd, ADD_READ);
+	cilk_extract_free_variables (BIND_EXPR_BODY (t), wd, ADD_READ);
 	return;
       }
 
@@ -1147,17 +1113,17 @@ extract_free_variables (tree t, struct wrapper_data *wd,
       {
 	tree_stmt_iterator i;
 	for (i = tsi_start (t); !tsi_end_p (i); tsi_next (&i))
-	  extract_free_variables (*tsi_stmt_ptr (i), wd, ADD_READ);
+	  cilk_extract_free_variables (*tsi_stmt_ptr (i), wd, ADD_READ);
 	return;
       }
 
     case TARGET_EXPR:
       {
-	extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_BIND);
-	extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
-	extract_free_variables (TREE_OPERAND (t, 2), wd, ADD_READ);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_BIND);
+	cilk_extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
+	cilk_extract_free_variables (TREE_OPERAND (t, 2), wd, ADD_READ);
 	if (TREE_OPERAND (t, 3) != TREE_OPERAND (t, 1))
-	  extract_free_variables (TREE_OPERAND (t, 3), wd, ADD_READ);
+	  cilk_extract_free_variables (TREE_OPERAND (t, 3), wd, ADD_READ);
 	return;
       }
 
@@ -1171,32 +1137,32 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 
     case DECL_EXPR:
       if (TREE_CODE (DECL_EXPR_DECL (t)) != TYPE_DECL)
-	extract_free_variables (DECL_EXPR_DECL (t), wd, ADD_BIND);
+	cilk_extract_free_variables (DECL_EXPR_DECL (t), wd, ADD_BIND);
       return;
 
     case INTEGER_TYPE:
     case ENUMERAL_TYPE:
     case BOOLEAN_TYPE:
-      extract_free_variables (TYPE_MIN_VALUE (t), wd, ADD_READ);
-      extract_free_variables (TYPE_MAX_VALUE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_MIN_VALUE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_MAX_VALUE (t), wd, ADD_READ);
       return;
 
     case POINTER_TYPE:
-      extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
       break;
 
     case ARRAY_TYPE:
-      extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
-      extract_free_variables (TYPE_DOMAIN (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_DOMAIN (t), wd, ADD_READ);
       return;
 
     case RECORD_TYPE:
-      extract_free_variables (TYPE_FIELDS (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_FIELDS (t), wd, ADD_READ);
       return;
     
     case METHOD_TYPE:
-      extract_free_variables (TYPE_ARG_TYPES (t), wd, ADD_READ);
-      extract_free_variables (TYPE_METHOD_BASETYPE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_ARG_TYPES (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_METHOD_BASETYPE (t), wd, ADD_READ);
       return;
 
     case AGGR_INIT_EXPR:
@@ -1209,8 +1175,8 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	    len = TREE_INT_CST_LOW (TREE_OPERAND (t, 0));
 
 	    for (ii = 0; ii < len; ii++)
-	      extract_free_variables (TREE_OPERAND (t, ii), wd, ADD_READ);
-	    extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
+	      cilk_extract_free_variables (TREE_OPERAND (t, ii), wd, ADD_READ);
+	    cilk_extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
 	  }
 	break;
       }
@@ -1226,7 +1192,7 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	  /* Go through the subtrees.  We need to do this in forward order so
 	     that the scope of a FOR_EXPR is handled properly.  */
 	  for (i = 0; i < len; ++i)
-	    extract_free_variables (TREE_OPERAND (t, i), wd, ADD_READ);
+	    cilk_extract_free_variables (TREE_OPERAND (t, i), wd, ADD_READ);
 	}
     }
 }
@@ -1303,3 +1269,872 @@ build_cilk_sync (void)
   TREE_SIDE_EFFECTS (sync) = 1;
   return sync;
 }
+
+/* Zeros out all the fields in CFD and initializes CFD->WD.  */
+
+void
+cilk_init_cfd (struct cilk_for_desc *cfd)
+{
+  memset (cfd, 0, sizeof *cfd);
+  init_wd (&cfd->wd, CILK_BLOCK_FOR);
+}
+
+/* Returns the library fndecl matching the TYPE_PRECISION of COUNT_TYPE.  */
+
+static tree
+find_cilk_for_library_fn (tree count_type)
+{
+  if (TYPE_PRECISION (count_type) == 32)
+    return cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (count_type) == 64)
+    return cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+}
+
+/* Computes the direction of the loop from INCR, the increment expression.
+   Returns 0 if the sign of INCR is unknown,
+   +1 if the value is exactly +1,
+   +2 if the value is known to be positive, and
+   -2 if the value is known to be negative.  */
+
+static int
+cilk_compute_incr_direction (tree incr)
+{
+  if (TREE_CODE (incr) != INTEGER_CST)
+    return tree_expr_nonnegative_p (incr) ? 2 : 0;
+  else if (integer_onep (incr))
+    return 1;
+  else
+    return 2 * tree_int_cst_sgn (incr);
+}
+
+/* Return the count type based on TYPE of a Cilk for loop, or unsigned long if
+   there is no acceptable type.  */
+
+tree
+cilk_check_loop_difference_type (tree type)
+{
+  if ((TYPE_PRECISION (type) > TYPE_PRECISION (long_unsigned_type_node))
+      || (TYPE_MAIN_VARIANT (type) == long_long_integer_type_node)
+      || (TYPE_MAIN_VARIANT (type) == long_long_unsigned_type_node))
+    return long_long_unsigned_type_node;
+
+  return long_unsigned_type_node;
+}
+
+/* Removes unwanted wrappers from a tree, T.  */
+
+tree
+cilk_simplify_tree (tree t)
+{
+  extern tree tree_ssa_strip_useless_type_conversions (tree);
+
+  if (TREE_CODE (t) == CLEANUP_POINT_EXPR)
+    t = TREE_OPERAND (t, 0);
+  if (TREE_CODE (t) == NOP_EXPR)
+    t = TREE_OPERAND (t, 0);
+  if ((TREE_CODE (t) == CONVERT_EXPR) && (VOID_TYPE_P (TREE_TYPE (t)) != 0))
+    t = TREE_OPERAND (t, 0);
+
+  STRIP_USELESS_TYPE_CONVERSION (t);
+
+  return t;
+}
+
+/* Set up the variable mapping for the FNDECL and install parameters after
+   declaring the function and scanning the loop body's variable use.
+   Information about the _Cilk_for statement is stored in *CFD.  */
+
+void
+declare_cilk_for_vars (struct cilk_for_desc *cfd, tree fndecl)
+{
+  tree var2 = build_decl (cfd->loc, VAR_DECL, DECL_NAME (cfd->var),
+                     cfd->var_type);
+  DECL_CONTEXT (var2) = fndecl;
+  cfd->var2 = var2;
+
+  void **mapped = pointer_map_contains (cfd->wd.decl_map, cfd->var);
+  /* The loop control variable must be mapped.  */
+  gcc_assert (mapped);
+  const_tree t = (const_tree) *mapped;
+
+  /* The loop control variable may appear as mapped to itself
+     or mapped to integer_one_node depending on its type and
+     how it was modified.  */
+  if ((TREE_CODE (t) != INTEGER_CST) || (t == integer_one_node))
+    {
+      tree save_function = current_function_decl;
+      current_function_decl = DECL_CONTEXT (cfd->var);
+      warning (0, "loop body modifies control variable %qD", cfd->var);
+      current_function_decl = save_function;
+    }
+  *mapped = (void *) var2;
+
+  tree p = cfd->wd.parms;
+  DECL_ARGUMENTS (fndecl) = p;
+  do
+    {
+      DECL_CONTEXT (p) = fndecl;
+      p = TREE_CHAIN (p);
+    }
+  while (p);
+}
+
+/* Set up the signature and parameters of the _Cilk_for body function
+   before declaring the function using information stored in CFD.  */
+
+void
+declare_cilk_for_parms (struct cilk_for_desc *cfd)
+{
+  tree count_type = cfd->count_type;
+  tree ro_count = build_qualified_type (count_type, TYPE_QUAL_CONST);
+  tree ctx = build_decl (cfd->loc, PARM_DECL, NULL_TREE, ptr_type_node);
+  tree t1 = get_identifier ("__low");
+  tree min_parm = build_decl (cfd->loc, PARM_DECL, t1, ro_count);
+  tree t2 = get_identifier ("__high");
+  tree max_parm = build_decl (cfd->loc, PARM_DECL, t2, ro_count);
+
+  DECL_ARG_TYPE (max_parm) = count_type;
+  DECL_ARTIFICIAL (max_parm) = 1;
+  TREE_READONLY (max_parm) = 1;
+
+  DECL_ARG_TYPE (min_parm) = count_type;
+  DECL_ARTIFICIAL (min_parm) = 1;
+  TREE_READONLY (min_parm) = 1;
+
+  DECL_ARG_TYPE (ctx) = ptr_type_node;
+  DECL_ARTIFICIAL (ctx) = 1;
+  TREE_READONLY (ctx) = 1;
+
+  TREE_CHAIN (min_parm) = max_parm;
+  TREE_CHAIN (ctx) = min_parm;
+
+  tree types = tree_cons (NULL_TREE, TREE_TYPE (max_parm), void_list_node);
+  types = tree_cons (NULL_TREE, TREE_TYPE (min_parm), types);
+  types = tree_cons (NULL_TREE, TREE_TYPE (ctx), types);
+
+  cfd->min_parm = min_parm;
+  cfd->max_parm = max_parm;
+  cfd->wd.argtypes = types;
+  cfd->wd.arglist = NULL_TREE;
+  cfd->wd.parms = ctx;
+}
+
+/* Converts EXP, an expression from a _Cilk_for loop, to TYPE.  Returns EXP
+   unchanged if it already has type TYPE.  */
+
+tree
+cilk_loop_convert (tree type, tree exp)
+{
+  enum tree_code code;
+  int inprec, outprec;
+  if (type == TREE_TYPE (exp))
+    return exp;
+  inprec = TYPE_PRECISION (TREE_TYPE (exp));
+  outprec = TYPE_PRECISION (type);
+  if (outprec > inprec && !TYPE_UNSIGNED (TREE_TYPE (exp)))
+    code = CONVERT_EXPR;
+  else
+    code = NOP_EXPR;
+  return fold_build1 (code, type, exp);
+}
+
+/* Returns COUNT divided by INCR (negated if NEGATE) via OP, in type TYPE.  */
+
+tree
+cilk_divide_count (tree count, enum tree_code op, tree incr, bool negate,
+		   tree type)
+{
+  tree dtype;
+
+  if (!count)
+    return NULL_TREE;
+
+  tree ctype = TREE_TYPE (count);
+  tree itype = TREE_TYPE (incr);
+
+  if (op == NOP_EXPR && !negate)
+    return cilk_loop_convert (type, count);
+  /* Return -(unsigned) count instead of (unsigned)-count in case the negate
+     overflows.  */
+  if (op == NOP_EXPR && negate)
+    return fold_build1 (NEGATE_EXPR, type, cilk_loop_convert (type, count));
+
+  /* We are dividing two positive values or else the user has invoked
+     undefined behavior.  That means we can divide in a common narrow
+     type and widen after.  This does not work if we must negate signed
+     INCR to get a positive value because we could be negating INT_MIN.  */
+
+  if (ctype != itype || (negate && !TYPE_UNSIGNED (itype)))
+    {
+      incr = cilk_loop_convert (type, incr);
+      count = cilk_loop_convert (type, count);
+      dtype = type;
+    }
+  else
+    dtype = ctype;
+
+  if (negate)
+    incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (incr), incr);
+
+  count = fold_build2 (op, dtype, count, incr);
+  if (dtype != type)
+    count = cilk_loop_convert (type, count);
+
+  return count;
+}
+
+/* Sets *DIV_OP to the operation used to divide the loop range and *FORWARD
+   to the condition expression for a forward loop, based on CFD->DIRECTION,
+   CFD->INCR_SIGN and CFD->EXACTLY_ONE.  */
+
+void
+cilk_calc_forward_div_op (struct cilk_for_desc *cfd, enum tree_code *div_op,
+			  tree *forward)
+{
+  switch (cfd->direction)
+    {
+    case -2:
+      *forward = boolean_false_node;
+      *div_op = CEIL_DIV_EXPR;
+      break;
+    case -1:
+      *forward = boolean_false_node;
+      *div_op = EXACT_DIV_EXPR;
+      break;
+    case 0:
+      *forward = build2 (cfd->incr_sign > 0 ? GE_EXPR : LT_EXPR,
+			 boolean_type_node, cfd->incr, integer_zero_node);
+      /* Loops with indeterminate direction use != and are always exact.  */
+      *div_op = EXACT_DIV_EXPR;
+      break;
+    case 1:
+      *forward = boolean_true_node;
+      *div_op = EXACT_DIV_EXPR;
+      break;
+    case 2:
+      *forward = boolean_true_node;
+      *div_op = CEIL_DIV_EXPR;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  if (cfd->exactly_one)
+    *div_op = NOP_EXPR;
+}
+
+/* Returns the loop-count based on the _Cilk_for loop's characteristics given
+   in *CFD.  DIV_OP indicates whether an exact division or a ceiling
+   division needs to be performed.  COUNT_UP and COUNT_DOWN are not
+   NULL_TREE if the increment and decrement operations are done using an
+   iterator.  */
+
+tree
+cilk_compute_loop_count (struct cilk_for_desc *cfd, enum tree_code div_op,
+			 tree forward, tree count_up, tree count_down)
+{
+  /* If the initial value is not given, then use the variable since it holds
+     the lower value.  */
+  tree low = cfd->lower_bound ? cfd->lower_bound : cfd->var;
+
+  /* Same logic as low for high variable.  */
+  tree high = cfd->end_var ? cfd->end_var : cfd->end_expr;
+
+  if (low == error_mark_node || high == error_mark_node)
+    {
+      gcc_assert (errorcount || sorrycount);
+      return error_mark_node;
+    }
+
+  /* If either count_up or count_down is not NULL, then it is an indication
+     that we have an iterator for loop computation, so we check if
+     cfd->iterator is set to true.  */
+  if (count_up != NULL_TREE || count_down != NULL_TREE)
+    gcc_assert (cfd->iterator);
+  else
+    {
+      tree low_type = TREE_TYPE (low);
+      tree high_type = TREE_TYPE (high);
+      tree sub_type = NULL_TREE;
+
+      if (TREE_CODE (TREE_TYPE (cfd->var)) == POINTER_TYPE)
+	sub_type = ptrdiff_type_node;
+      else
+	{
+	  /* We need to compute HIGH - LOW or LOW - HIGH without overflow.  */
+	  sub_type = common_type (low_type, high_type);
+
+	  /* If subtracting two signed variables without widening, then convert
+	     them to unsigned.  */
+	  if (!TYPE_UNSIGNED (sub_type)
+	      && (TYPE_PRECISION (sub_type) == TYPE_PRECISION (low_type)
+		  || TYPE_PRECISION (sub_type) == TYPE_PRECISION (high_type)))
+	    sub_type = unsigned_type_for (sub_type);
+	}
+      if (low_type != sub_type)
+	low = convert (sub_type, low);
+      if (high_type != sub_type)
+	high = convert (sub_type, high);
+
+      if (cfd->direction <= 0)
+	count_down = fold_build2 (MINUS_EXPR, sub_type, low, high);
+      if (cfd->direction >= 0)
+	count_up = fold_build2 (MINUS_EXPR, sub_type, high, low);
+    }
+
+  /* If the loop is not exact, add 1 before dividing.  Otherwise add 1 after
+     dividing.  We assume it can't overflow (meaning the loop range cannot
+     exceed the range of the loop variable or difference type).  */
+  if (cfd->inclusive && div_op == CEIL_DIV_EXPR)
+    {
+      if (count_up)
+	count_up = fold_build2 (PLUS_EXPR, TREE_TYPE (count_up), count_up,
+				build_one_cst (TREE_TYPE (count_up)));
+      if (count_down)
+	count_down = fold_build2 (PLUS_EXPR, TREE_TYPE (count_down), count_down,
+				  build_one_cst (TREE_TYPE (count_down)));
+    }
+
+  /* Serial semantics: INCR is converted to the common type of VAR and INCR then
+     the result is converted to the type of VAR.  */
+  tree incr = cfd->incr;
+  if (!cfd->iterator && TREE_CODE (TREE_TYPE (cfd->var)) != POINTER_TYPE)
+    incr = cilk_loop_convert (common_type (TREE_TYPE (cfd->var),
+					   TREE_TYPE (incr)), incr);
+
+  /* Now separately divide each count by +/- INCR yielding a value with type
+     CFD->COUNT_TYPE.  */
+  count_up = cilk_divide_count (count_up, div_op, incr, cfd->incr_sign < 0,
+				cfd->count_type);
+  count_down = cilk_divide_count (count_down, div_op, incr, cfd->incr_sign > 0,
+				  cfd->count_type);
+  /* Merge forward and backward counts.  */
+  tree count = NULL_TREE;
+  if (!count_up)
+    count = count_down;
+  else if (!count_down)
+    count = count_up;
+  else
+    count = fold_build3 (COND_EXPR, cfd->count_type, forward, count_up,
+			 count_down);
+
+  /* Add one, maybe.  */
+  if (cfd->inclusive && div_op != CEIL_DIV_EXPR)
+    count = fold_build2 (PLUS_EXPR, cfd->count_type, count,
+			 build_one_cst (cfd->count_type));
+
+  return count;
+}
+
+/* Returns a GIMPLE_SEQ that contains a call to the Cilk library function and
+   the necessary temporary variables.  COUNT and FN are parameters to the
+   library function indicating the loop-count and nested function,
+   respectively.  */
+
+gimple_seq
+insert_cilk_for_nested_fn (struct cilk_for_desc *cfd, tree count, tree fn)
+{
+  /* INNER_SEQ contains evaluation of variables holding loop increment and
+     count.  These are evaluated inside the loop guard.  */
+  gimple_seq inner_seq = 0;
+  if (!TREE_CONSTANT (count))
+    {
+      count = fold_build_cleanup_point_expr (TREE_TYPE (count), count);
+      count = get_formal_tmp_var (count, &inner_seq);
+    }
+
+  if (TREE_SIDE_EFFECTS (cfd->incr))
+    cfd->incr = get_formal_tmp_var (cfd->incr, &inner_seq);
+
+  tree libfun = find_cilk_for_library_fn (cfd->count_type);
+  tree ctx = cfd->ctx_arg;
+  if (ctx)
+    {
+      if (TREE_TYPE (ctx) != ptr_type_node)
+	ctx = fold_build1 (NOP_EXPR, ptr_type_node, ctx);
+      if (!DECL_P (ctx))
+	ctx = get_formal_tmp_var (ctx, &inner_seq);
+    }
+  else
+    {
+      ctx = fold_build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fn)), fn);
+      ctx = get_formal_tmp_var (ctx, &inner_seq);
+    }
+  fn = fold_build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fn)), fn);
+  TREE_CONSTANT (fn) = 1;
+  fn = get_formal_tmp_var (fn, &inner_seq);
+
+  tree grain = cfd->grain;
+  tree tmv_count_type = TYPE_MAIN_VARIANT (cfd->count_type);
+  if (!grain)
+    grain = get_formal_tmp_var (build_zero_cst (cfd->count_type), &inner_seq);
+  else if (TYPE_MAIN_VARIANT (TREE_TYPE (grain)) != tmv_count_type)
+    grain = convert (cfd->count_type, grain);
+
+  tree libfun_call = build_call_expr (libfun, 4, fn, ctx, count, grain);
+  gimplify_and_add (libfun_call, &inner_seq);
+  return inner_seq;
+}
+
+/* The loop function looks like
+   
+   body (void *, unsigned long min, unsigned long max)
+   const T start = [outer_context] var;
+   T var';
+   for (unsigned long i = min; i < max; i++) {
+     var' = start + (T) i * (T) incr;
+     body (var');
+   }
+
+   CILK_COMPUTE_LOOP_VAR returns an expression for
+   var' = start + i * incr;
+   or
+   var' = start - i * decr;
+   with suitable type conversions.
+
+   If direction is known we know the sign of INCR (or else it's
+   undefined behavior) and we can work with positive unsigned
+   numbers until the last addition or subtraction.
+
+   If direction is not known then the increment and loop variable
+   are signed but the product of the loop count and increment may
+   not be representable as a signed value.
+
+   We can't do the last addition or subtraction in C without
+   a conditional operation because the conversion of unsigned
+   to signed is undefined for "negative" values of the unsigned
+   number.  For now we just pretend this isn't a problem.  We
+   may fail on targets with signed overflow.
+
+   For iterator loops we require that the difference type have
+   enough range and simply pass the value to operator+ or operator-
+   based on the static direction of the loop.  Iterator loop case is
+   handled by the function passed in a function pointer, *ITER_HANDLER.
+
+   LOOP_VAR has type COUNT_TYPE.  */
+
+tree
+cilk_compute_loop_var (struct cilk_for_desc *cfd, tree loop_var,
+		       tree lower_bound,
+		       tree (*iter_handler) (location_t, enum tree_code,
+					     tree, tree, tree))
+{
+  tree count_type = NULL_TREE;
+  if (INTEGRAL_TYPE_P (TREE_TYPE (loop_var)))
+    count_type = TREE_TYPE (loop_var);
+  else
+    count_type = cfd->count_type;
+  
+  /* Compute an expression to be added or subtracted.
+
+     We want to add or subtract LOOP_VAR * INCR.  INCR may be negative.
+     If the static direction is indeterminate we don't know that at
+     compile time.  The code to convert to unsigned and multiply does
+     the right thing in the end.  For iterator loops we don't need to
+     go to that trouble, but scalar loops can have a range that cannot
+     be represented in the signed loop variable.  */
+  tree scaled = NULL_TREE, incr = NULL_TREE;
+  if (integer_onep (cfd->incr))
+    scaled = loop_var;
+  else
+    {
+      incr = cilk_loop_convert (count_type, cfd->incr);
+      scaled = fold_build2 (MULT_EXPR, count_type, loop_var, incr);
+    }
+
+  enum tree_code add_op = cfd->incr_sign >= 0 ? PLUS_EXPR : MINUS_EXPR;
+  if (cfd->iterator)
+    {
+      /* Convert LOOP_VAR to T3 (difference_type) so that
+         operator+(T1, T3) is preferred over operator+(T1, count_type).
+         operator+ constructs the object if it returns by value.
+         Use operator- if the user wrote -=.  */
+      if (count_type != cfd->difference_type)
+	loop_var = convert (cfd->difference_type, scaled);
+      tree low = lower_bound ? lower_bound : cfd->var;
+      if (TREE_CODE (low) == TREE_LIST)
+	low = TREE_VALUE (low);
+      tree exp = iter_handler (cfd->loc, add_op, low, loop_var, cfd->var2);
+      return exp;
+    }
+  /* The scaled count may not be representable in the type of the
+     loop variable, e.g. if the loop range is INT_MIN+1 to INT_MAX-1
+     the range does not fit in a signed int.  The sum of the lower
+     bound and the count is representable.  Do the addition or
+     subtraction in the wider type, then narrow.  */
+  tree cvt_val = cilk_loop_convert (count_type, lower_bound);
+  tree adjusted = fold_build2 (add_op, count_type, cvt_val, scaled);
+  tree exp = fold_build2 (MODIFY_EXPR, void_type_node, cfd->var2,
+			  cilk_loop_convert (cfd->var_type, adjusted));
+  return exp;
+}
+
+/* Remove NOP_EXPR, ADDR_EXPR and INDIRECT_REF wrappers from EXP and
+   return.  */
+
+tree
+cilk_tree_operand_noconv (tree exp)
+{
+  tree op = exp;
+  while (TREE_CODE (op) == NOP_EXPR
+	 || TREE_CODE (op) == ADDR_EXPR
+	 || TREE_CODE (op) == INDIRECT_REF)
+    op = TREE_OPERAND (op, 0);
+  return op;
+}
+
+/* Return the TREE_CODE for an overloaded function call, FN_CALL.  */
+
+enum tree_code
+cilk_find_code_from_call (tree fn_call)
+{
+  /* Unwrap the ADDR_EXPR layer.  */
+  tree call = TREE_OPERAND (fn_call, 0);
+  call = DECL_NAME (call);
+  const char *name = IDENTIFIER_POINTER (call);
+  char op_name[2];
+  op_name[1] = name[strlen (name) - 1];
+  if (name [strlen (name) - 2] != 'r')
+    op_name[0] = name [strlen (name) - 2];
+  else
+    op_name[0] = ' ';
+
+  if (op_name[1] == '<')
+    return LT_EXPR;
+  else if (op_name[1] == '>')
+    return GT_EXPR;
+  else if (op_name[1] == '=')
+    {
+      if (op_name[0] == '<')
+	return LE_EXPR;
+      else if (op_name[0] == '>')
+	return GE_EXPR;
+      else if (op_name[0] == '!')
+	return NE_EXPR;
+      else if (op_name[0] == '=')
+	return EQ_EXPR;
+      else if (op_name[0] == '+')
+	return PLUS_EXPR;
+      else if (op_name[0] == '-')
+	return MINUS_EXPR;
+      else
+	gcc_unreachable ();
+    }
+  else if (op_name[1] == '+' && op_name[0] == '+')
+    /* This could be post or pre increment expression, but for our case
+       it really does not matter.  */
+    return POSTINCREMENT_EXPR;
+  else if (op_name[1] == '-' && op_name[0] == '-')
+    /* Same reasoning as above for decrement expression.  */
+    return POSTDECREMENT_EXPR;
+  else
+    gcc_unreachable ();
+  return NOP_EXPR;
+}
+
+/* Extracts the initial value of the initializer for a CILK_FOR_STMT.  This
+   information is stored in CFD->LOWER_BOUND.  */
+
+void
+cilk_set_init_info (struct cilk_for_desc *cfd)
+{
+  if (!cfd->lower_bound)
+    return;
+  else if (TREE_CODE (cfd->lower_bound) == MODIFY_EXPR
+	   || TREE_CODE (cfd->lower_bound) == INIT_EXPR)
+    {
+      tree op0 = TREE_OPERAND (cfd->lower_bound, 0);
+      tree op1 = TREE_OPERAND (cfd->lower_bound, 1);
+
+      gcc_assert (op0 == cfd->var);
+      cfd->lower_bound = op1;
+    }
+}
+
+/* Sets CFD->INCLUSIVE, CFD->END_EXPR and CFD->DIRECTION based on the
+   characteristics of the _Cilk_for statement.  */
+
+void
+cilk_set_inclusive_and_direction (struct cilk_for_desc *cfd)
+{
+  tree cond = cfd->cond;
+  enum tree_code cond_code = TREE_CODE (cond);
+  if (cond_code == CALL_EXPR)
+    cond_code = cilk_find_code_from_call (CALL_EXPR_FN (cond));
+  
+  switch (cond_code)
+    {
+    case NE_EXPR:
+      cfd->inclusive = false;
+      cfd->direction = 0;
+      break;
+    case GE_EXPR:
+      cfd->inclusive = true;
+      cfd->direction = -2;
+      break;
+    case GT_EXPR:
+      cfd->inclusive = false;
+      cfd->direction = -2;
+      break;
+    case LE_EXPR:
+      cfd->inclusive = true;
+      cfd->direction = 2;
+      break;
+    case LT_EXPR:
+      cfd->inclusive = false;
+      cfd->direction = 2;
+      break;
+    default:
+      /* == is not allowed.  */
+      gcc_unreachable ();
+    }
+  tree limit = NULL_TREE;
+  tree arg0, arg1;
+  
+  if (TREE_CODE (cond) == CALL_EXPR)
+    {
+      arg0 = CALL_EXPR_ARG (cond, 0);
+      arg1 = CALL_EXPR_ARG (cond, 1);
+    }
+  else
+    {
+      arg0 = TREE_OPERAND (cond, 0);
+      arg1 = TREE_OPERAND (cond, 1);
+    }
+
+  if (cilk_tree_operand_noconv (arg0) == cfd->var)
+    limit = arg1;
+  else
+    {
+      /* If we are here, then we have a case like this: 10 > ii;  */
+      limit = arg0;
+      cfd->direction = -cfd->direction;
+    }
+
+  cfd->end_expr = limit;
+}
+
+/* Sets CFD->ITERATOR and CFD->DIFFERENCE_TYPE based on the characteristics of
+   the _Cilk_for statement.  */
+
+void
+cilk_set_iter_difftype (struct cilk_for_desc *cfd)
+{
+  tree var_type = TREE_TYPE (cfd->var);
+  gcc_assert (var_type);
+  
+  switch (TREE_CODE (var_type))
+    {
+    case POINTER_TYPE:
+      cfd->iterator = false;
+      cfd->difference_type = ptrdiff_type_node;
+      break;
+    case ENUMERAL_TYPE:
+    case BOOLEAN_TYPE:
+    case INTEGER_TYPE:
+      cfd->iterator = false;
+      cfd->difference_type = lang_hooks.types.type_promotes_to (var_type);
+      break;
+    case RECORD_TYPE:
+    case UNION_TYPE:
+      cfd->iterator = true;
+      cfd->difference_type = NULL; /* This will be set later for C++.  */
+      break;
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Populate CFD with characteristics of the increment expression.  If
+   HANDLE_PTR_MULT is set, then the increment is multiplied by the size of
+   the pointed-to type.  This is necessary for C++ but not for C.  */
+
+void
+cilk_set_incr_info (struct cilk_for_desc *cfd, bool handle_ptr_mult)
+{
+  int negate_incr = 0, incr_direction = 0;
+
+  cfd->incr = cilk_simplify_tree (cfd->incr);
+  enum tree_code inc_op = TREE_CODE (cfd->incr);
+  bool is_incr = false;
+  tree op0, op1;
+  tree incr;
+  if (inc_op == ADDR_EXPR || inc_op == CALL_EXPR || inc_op == INDIRECT_REF)
+    /* This indicates that the increment operation is overloaded.  */
+    incr = cilk_tree_operand_noconv (cfd->incr);
+  else if (inc_op == TARGET_EXPR)
+    incr = TARGET_EXPR_INITIAL (cfd->incr);
+  else
+    incr = cfd->incr;
+      
+  if (TREE_CODE (incr) == CALL_EXPR)
+    {
+      inc_op = cilk_find_code_from_call (CALL_EXPR_FN (incr));
+      if (inc_op == PLUS_EXPR || inc_op == MINUS_EXPR)
+	{
+	  op1 = CALL_EXPR_ARG (incr, 1);
+	  op0 = cilk_tree_operand_noconv (CALL_EXPR_ARG (incr, 0));
+	  inc_op = (inc_op == PLUS_EXPR ? PREINCREMENT_EXPR
+		    : PREDECREMENT_EXPR);
+	}
+      else if (inc_op == POSTINCREMENT_EXPR || inc_op == POSTDECREMENT_EXPR
+	       || inc_op == PREDECREMENT_EXPR || inc_op == PREINCREMENT_EXPR)
+	op1 = integer_one_node;
+      else
+	op1 = CALL_EXPR_ARG (incr, 0);
+    }
+  else
+    op1 = TREE_OPERAND (cfd->incr, 1);
+  
+  is_incr = (inc_op == PREINCREMENT_EXPR || inc_op == POSTINCREMENT_EXPR);
+  switch (inc_op)
+    {
+    case POSTDECREMENT_EXPR:
+    case PREDECREMENT_EXPR:
+    case PREINCREMENT_EXPR:
+    case POSTINCREMENT_EXPR:
+      negate_incr = is_incr ? false : true;
+      incr_direction = is_incr ? 1 : -1;
+      cfd->incr = op1;
+      if (!cfd->incr)
+	{
+	  tree var_type = TREE_TYPE (cfd->var);
+	  if (TREE_CODE (var_type) == POINTER_TYPE)
+	    cfd->incr = size_in_bytes (TREE_TYPE (var_type));
+	  else
+	    cfd->incr = integer_one_node;
+	}
+      cfd->exactly_one = integer_onep (cfd->incr);
+      break;
+    case MODIFY_EXPR:
+      {
+	/* Here the expression has the form VAR <MODIFY_OP> INCR or
+	   VAR = VAR <OPERATION> INCR.  */
+	cfd->incr = (TREE_CODE (cfd->incr) != MODIFY_EXPR ? op1
+		     : TREE_OPERAND (cfd->incr, 1));
+	enum tree_code increment_code = TREE_CODE (cfd->incr);
+	if (increment_code == PLUS_EXPR || increment_code == POINTER_PLUS_EXPR)
+	  {
+	    op0 = TREE_OPERAND (cfd->incr, 0);
+	    op1 = TREE_OPERAND (cfd->incr, 1);
+
+	    if (op0 == cfd->var || DECL_NAME (op0) == DECL_NAME (cfd->var))
+	      cfd->incr = op1;
+	    else if (op1 == cfd->var || DECL_NAME (op1) == DECL_NAME (cfd->var))
+	      cfd->incr = op0;
+	    else
+	      gcc_unreachable ();
+
+	    negate_incr = false;
+	    incr_direction = cilk_compute_incr_direction (cfd->incr);
+	  
+	    /* Adding a negative number treated as unsigned is like adding a
+	       large positive number.  */
+	    if (TYPE_UNSIGNED (cfd->difference_type) && incr_direction < 0)
+	      incr_direction = 2;
+	    cfd->exactly_one = (incr_direction == 1);
+
+	    /* Don't need to do this in POINTER_PLUS_EXPR since it already
+	       does this for you.  */
+	    if (handle_ptr_mult && increment_code != POINTER_PLUS_EXPR)
+	      {
+		tree var_type = TREE_TYPE (cfd->var);
+		if (TREE_CODE (var_type) == POINTER_TYPE)
+		  {
+		    tree size = size_in_bytes (TREE_TYPE (var_type));
+		    if (!integer_onep (size))
+		      {
+			cfd->exactly_one = 0;
+			/* For example, in the following _Cilk_for statement:
+			   _Cilk_for (int *p = a; p < b; p += (char)c)
+			   we need to do the math in a type wider than c.
+			   "build_binary_op" will do the default conversions
+			   which should be enough if SIZE is size_t.  */
+			cfd->incr = build_binary_op (cfd->loc, MULT_EXPR,
+						     cfd->incr, size, 0);
+		      }
+		  }
+	      }
+	  }
+	else if (TREE_CODE (cfd->incr) == MINUS_EXPR)
+	  {
+	    op1 = TREE_OPERAND (cfd->incr, 1);
+	    op0 = TREE_OPERAND (cfd->incr, 0);
+
+	    gcc_assert (op0 == cfd->var
+			|| DECL_NAME (op0) == DECL_NAME (cfd->var));
+	    cfd->incr = op1;
+	    negate_incr = true;
+	    incr_direction = -cilk_compute_incr_direction (cfd->incr);
+
+	    /* Subtracting a negative number treated as unsigned is like
+	       subtracting a large positive number.  */
+	    if (TYPE_UNSIGNED (cfd->difference_type) && incr_direction > 0)
+	      incr_direction = -2;
+	    cfd->exactly_one = (incr_direction == -1);
+
+	    /* In C++ we need to handle the pointer arithmetic manually, but
+	       in C it seems to be figured out automatically.  */
+	    if (handle_ptr_mult)
+	      {
+		tree var_type  = TREE_TYPE (cfd->var);
+		if (TREE_CODE (var_type) == POINTER_TYPE)
+		  {
+		    tree size = size_in_bytes (TREE_TYPE (var_type));
+		    if (!integer_onep (size))
+		      {
+			cfd->exactly_one = 0;
+			/* For example, in the following _Cilk_for statement:
+			   _Cilk_for (int *p = a; p < b; p += (char)c)
+			   we need to do the math in a type wider than c.
+			   "build_binary_op" will do the default conversions
+			   which should be enough if SIZE is size_t.  */
+			cfd->incr = build_binary_op (cfd->loc, MULT_EXPR,
+						     cfd->incr, size, 0);
+		      }
+		  }
+	      }
+	  }
+	else
+	  {
+	    location_t incr_loc = EXPR_LOCATION (cfd->incr);
+	    error_at (incr_loc, "invalid loop increment operation");
+	    cfd->invalid = true;
+	    return;
+	  }
+      }
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  cfd->var_type = TREE_TYPE (cfd->var);
+  cfd->incr_sign = negate_incr ? -1 : 1;
+}
+
+/* Helper function for walk_tree.  Replaces every continue statement inside a
+   _Cilk_for body with a goto to the label passed in DATA.  */
+
+tree
+cilk_resolve_continue_stmts (tree *tp, int *walk_subtrees, void *data)
+{
+  tree goto_label = NULL_TREE, goto_stmt = NULL_TREE;
+  if (!tp || !*tp)
+    return NULL_TREE;
+
+  if (TREE_CODE (*tp) == CONTINUE_STMT)
+    {
+      goto_label = (tree) data;
+      goto_stmt = build1 (GOTO_EXPR, void_type_node, goto_label);
+      *tp = goto_stmt;
+      *walk_subtrees = 0;
+    }
+  else if (TREE_CODE (*tp) == FOR_STMT || TREE_CODE (*tp) == WHILE_STMT
+           || TREE_CODE (*tp) == DO_STMT || TREE_CODE (*tp) == CILK_FOR_STMT)
+    /* Inside these statements a continue targets the inner loop, not the
+       end of the _Cilk_for.  Do not walk into these trees; they are
+       resolved later.  */
+    *walk_subtrees = 0;
+
+  return NULL_TREE;
+}
diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in
index d79fc4f..baa8af2 100644
--- a/gcc/c/Make-lang.in
+++ b/gcc/c/Make-lang.in
@@ -51,7 +51,7 @@ CFLAGS-c/gccspec.o += $(DRIVER_DEFINES)
 # Language-specific object files for C and Objective C.
 C_AND_OBJC_OBJS = attribs.o c/c-errors.o c/c-decl.o c/c-typeck.o \
   c/c-convert.o c/c-aux-info.o c/c-objc-common.o c/c-parser.o \
-  c/c-array-notation.o $(C_COMMON_OBJS) $(C_TARGET_OBJS)
+  c/c-array-notation.o c/c-cilk.o $(C_COMMON_OBJS) $(C_TARGET_OBJS)
 
 # Language-specific object files for C.
 C_OBJS = c/c-lang.o c-family/stub-objc.o $(C_AND_OBJC_OBJS)
diff --git a/gcc/c/c-cilk.c b/gcc/c/c-cilk.c
new file mode 100644
index 0000000..982b0de
--- /dev/null
+++ b/gcc/c/c-cilk.c
@@ -0,0 +1,359 @@
+/* This file is part of the Intel (R) Cilk (TM) Plus support
+   This file contains the functions required to handle _Cilk_for
+   for the C language.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Balaji V. Iyer <balaji.v.iyer@intel.com>,
+   Intel Corporation
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "c-tree.h"
+#include "langhooks.h"
+#include "gimple.h"
+#include "tree-iterator.h"
+#include "tree-inline.h"
+#include "c-family/c-common.h"
+#include "toplev.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+#include "cilk.h"
+#include "gimplify.h"
+
+/* Get a block for the CILK_FOR_STMT, CFOR.  */
+
+static tree
+block_cilk_for_loop (tree cfor)
+{
+  tree block = tree_block (cfor);
+  if (block)
+    return block;
+  return DECL_INITIAL (current_function_decl);
+}
+
+/* Create or discover the variable to be used in the loop termination
+   condition.  Return true if the cfd->end_var should be used in the
+   guard test around the runtime call.  Otherwise the guard test uses
+   the complex expression, which in C++ may initialize the variable.
+
+   For example, if END_EXPR is
+
+   (target_expr limit (call constructor ...))
+
+   the variable limit is not initialized until the target_expr is
+   evaluated.  */
+
+static bool
+cilk_for_end (struct cilk_for_desc *cfd, gimple_seq *pre_p)
+{
+  tree end = cfd->end_expr;
+  if (TREE_SIDE_EFFECTS (end))
+    {
+      enum tree_code ecode = TREE_CODE (end);
+      if (ecode == INIT_EXPR || ecode == MODIFY_EXPR || ecode == TARGET_EXPR)
+        {
+          cfd->end_var = TREE_OPERAND (end, 0);
+          return false;
+        }
+      else
+        {
+          /* Copy the result of evaluating the expression into a variable.
+             The compiler will probably crash if there's anything
+             complicated in it -- a complicated value needs to go through
+             the other branch of this IF using an explicit temporary.  */
+          cfd->end_var = get_formal_tmp_var (end, pre_p);
+          return true;
+        }
+    }
+  cfd->end_var = end;
+  return false;
+}
+
+/* Handler for iterator to compute the loop variable.  ADD_OP combines LOW,
+   the starting point, with LOOP_VAR, the induction variable; the result is
+   stored in VAR2, the original induction variable of the _Cilk_for.
+   Returns an expression (or a STATEMENT_LIST of expressions).  */
+
+static tree
+compute_loop_var_c_iter_hdl (location_t loc, enum tree_code add_op, tree low,
+			     tree loop_var, tree var2)
+{
+  tree exp = fold_build2 (add_op, TREE_TYPE (loop_var), low, loop_var);
+  gcc_assert (exp != error_mark_node);
+
+  exp = build_modify_expr (loc, var2, TREE_TYPE (var2), INIT_EXPR, loc, exp,
+			   TREE_TYPE (exp));
+  gcc_assert (exp != error_mark_node);
+  return exp;
+}
+
+
+/* Creates the body of the _Cilk_for wrapper function from the information
+   in *CFD.  */
+
+static tree
+create_cilk_for_body (struct cilk_for_desc *cfd)
+{
+  declare_cilk_for_parms (cfd);
+  cfd->wd.fntype = build_function_type (void_type_node, cfd->wd.argtypes);
+  tree fndecl = cilk_create_cilk_helper_decl (&cfd->wd);
+
+  tree outer = current_function_decl;
+  push_struct_function (fndecl);
+  current_function_decl = fndecl;
+
+  declare_cilk_for_vars (cfd, fndecl);
+
+  tree body = push_stmt_list ();
+  tree mod_expr = NULL_TREE, loop_var = NULL_TREE;
+  tree lower_bound = cfd->lower_bound;
+  if (!lower_bound)
+    {
+      lower_bound = cfd->var;
+      tree hack = build_decl (EXPR_LOCATION (cfd->var), VAR_DECL, NULL_TREE,
+			      TREE_TYPE (lower_bound));
+      DECL_CONTEXT (hack) = DECL_CONTEXT (lower_bound);
+      DECL_NAME (hack) = DECL_NAME (lower_bound);
+      *pointer_map_insert (cfd->wd.decl_map, hack) = lower_bound;
+      lower_bound = hack;
+    }
+  if (INTEGRAL_TYPE_P (cfd->var_type))
+    {
+      tree new_min_parm = fold_build1 (CONVERT_EXPR, cfd->var_type,
+				       cfd->min_parm);
+      loop_var = create_tmp_var (cfd->var_type, NULL);
+      location_t loc = EXPR_LOCATION (cfd->var);
+      mod_expr = build_modify_expr (loc, loop_var, cfd->var_type, NOP_EXPR,
+				    loc, new_min_parm, cfd->var_type);
+    }
+  else
+    {
+      loop_var = create_tmp_var (TREE_TYPE (cfd->min_parm), NULL);
+      mod_expr = fold_build2 (INIT_EXPR, void_type_node, loop_var,
+			      cfd->min_parm);
+    }
+  add_stmt (mod_expr);
+  tree loop_body = NULL_TREE;
+  tree new_max_parm = NULL_TREE;
+
+  if (!INTEGRAL_TYPE_P (cfd->var_type))
+    new_max_parm = cfd->max_parm;
+  else
+    new_max_parm = fold_build1 (CONVERT_EXPR, cfd->var_type, cfd->max_parm);
+
+  tree end_comp = cilk_compute_loop_var (cfd, loop_var, lower_bound,
+					 compute_loop_var_c_iter_hdl);
+  append_to_statement_list (end_comp, &loop_body);
+  append_to_statement_list (cfd->body, &loop_body);
+  loop_body = fold_build_cleanup_point_expr (void_type_node, loop_body);
+  DECL_SEEN_IN_BIND_EXPR_P (cfd->var2) = 1;
+
+  cilk_outline_body (fndecl, &loop_body, &cfd->wd, NULL);
+  
+  tree loop = push_stmt_list ();
+  /* Now create a loop with c_finish_loop.  */
+  tree incr = fold_build2 (PLUS_EXPR, TREE_TYPE (loop_var), loop_var,
+			   build_one_cst (TREE_TYPE (loop_var)));
+  incr = fold_build2 (MODIFY_EXPR, void_type_node, loop_var, incr);
+  tree cond = fold_build2 (LT_EXPR, boolean_type_node, loop_var, new_max_parm);
+  c_finish_loop (EXPR_LOCATION (cfd->var), cond, incr, loop_body, NULL_TREE,
+		 NULL_TREE, false);
+  loop = pop_stmt_list (loop);
+  add_stmt (loop);
+  body = pop_stmt_list (body);
+
+  tree block = DECL_INITIAL (fndecl);
+  BLOCK_VARS (block) = loop_var;
+  body = build3 (BIND_EXPR, void_type_node, loop_var, body, block);
+  TREE_CHAIN (loop_var) = cfd->var2;
+  if (cilk_detect_spawn_and_unwrap (&body))
+    lang_hooks.cilkplus.install_body_with_frame_cleanup (fndecl, body);
+  else
+    DECL_SAVED_TREE (fndecl) = body;
+
+  pop_cfun ();
+  current_function_decl = outer;
+  return fndecl;
+}
+
+/* Creates a wrapper function for the body of a _Cilk_for statement with the
+   information stored in *CFD.  */
+
+static tree
+create_cilk_for_wrapper (struct cilk_for_desc *cfd)
+{
+  tree old_cfd = current_function_decl;
+  tree incr = cfd->incr;
+  tree var = cfd->var;
+
+  if (POINTER_TYPE_P (TREE_TYPE (var)))
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_WRITE);
+  else
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_READ);
+
+  cilk_extract_free_variables (incr, &cfd->wd, ADD_READ);
+
+  /* Map the loop variable to integer_minus_one_node if we won't really
+     be passing it to the loop body and integer_zero_node otherwise.
+
+     If the map ends up integer_one_node then somebody wrote to the loop
+     variable and that's a user error.
+     The correct map will be installed in declare_cilk_for_vars.  */
+  *pointer_map_insert (cfd->wd.decl_map, var) =
+    (void *) (cfd->lower_bound ? integer_minus_one_node : integer_zero_node);
+  
+  /* Note that variables are not extracted from the loop condition
+     and increment.  They are evaluated, to the extent they are
+     evaluated, in the context containing the for loop.  */
+  cilk_extract_free_variables (cfd->body, &cfd->wd, ADD_READ);
+
+  tree fn = create_cilk_for_body (cfd);
+  DECL_UNINLINABLE (fn) = 1;
+  current_function_decl = old_cfd;
+  set_cfun (DECL_STRUCT_FUNCTION (current_function_decl));
+  cfun->is_cilk_function = 1;
+
+  /* Add the new function to the cgraph list.  */
+  cilk_call_graph_add_fn (fn);
+  return fn;
+}
+
+/* Helper function for gimplify_cilk_for.  *CFD contains all the relevant
+   information extracted from a _Cilk_for statement passed into the parent
+   function gimplify_cilk_for.  */
+   
+void
+gimplify_cilk_for_1 (struct cilk_for_desc *cfd, gimple_seq *pre_p)
+{ 
+  /* We don't have to evaluate INCR only once, but we do have
+     to evaluate it no more times than in the serial loop.
+     The naive method evaluates INCR exactly that many times
+     except if the static loop direction is indeterminate.
+
+     Storing the increment in a variable is thus mandatory
+     if cfd->direction == 0.  It is an optimization otherwise
+     and there seems no harm and some benefit in doing it.
+
+     The evaluation is on the inner statement list.  The
+     increment cannot be referenced prior to the loop test.  */
+  if (TREE_SIDE_EFFECTS (cfd->incr))
+    sorry ("_Cilk_for increment with side effects");
+
+  tree cond = cfd->cond;
+  tree op0 = TREE_OPERAND (cond, 0);
+  tree op1 = TREE_OPERAND (cond, 1);
+  if (cilk_for_end (cfd, pre_p) && cfd->end_var != cfd->end_expr)
+    {
+      if (op1 == cfd->end_expr)
+	op1 = cfd->end_var;
+      else
+	op0 = cfd->end_var;
+    }
+  cond = fold_build2 (TREE_CODE (cond), boolean_type_node, op0, op1);
+
+  tree forward = NULL_TREE;
+  
+  /* This is set to NOP_EXPR to have an initial value since we are passing in
+     an address to the function below.  */
+  enum tree_code div_op = NOP_EXPR;
+
+  cilk_calc_forward_div_op (cfd, &div_op, &forward);
+
+  tree count = cilk_compute_loop_count (cfd, div_op, forward, NULL_TREE,
+					NULL_TREE);
+  tree fn = create_cilk_for_wrapper (cfd);
+
+  /* Set condition correctly, so that the function below can use it.  */
+  cfd->cond = cond;
+  gimple_seq inner_seq = insert_cilk_for_nested_fn (cfd, count, fn);
+  gimple_seq_add_seq (pre_p, inner_seq);
+}
+
+/* Extracts all the relevant information from CFOR, a CILK_FOR_STMT tree, and
+   stores it in *CFD.  */
+
+static void
+c_extract_cilk_for_fields (struct cilk_for_desc *cfd, tree cfor)
+{
+  cfd->var = CILK_FOR_VAR (cfor);
+  cfd->cond = CILK_FOR_COND (cfor);
+  cfd->lower_bound = CILK_FOR_INIT (cfor);
+  cfd->incr = CILK_FOR_EXPR (cfor);
+  cfd->loc = EXPR_LOCATION (cfor);
+  cfd->body = CILK_FOR_BODY (cfor);
+  cfd->grain = CILK_FOR_GRAIN (cfor);
+  cfd->invalid = false;
+
+  /* This function shouldn't be setting these two variables.  */
+  cfd->ctx_arg = NULL_TREE;
+  cfd->count = NULL_TREE;
+
+  if (TREE_CODE (cfd->lower_bound) == MODIFY_EXPR
+      || TREE_CODE (cfd->lower_bound) == INIT_EXPR)
+    {
+      tree op0 = TREE_OPERAND (cfd->lower_bound, 0); 
+      tree op1 = TREE_OPERAND (cfd->lower_bound, 1);
+
+      gcc_assert (op0 == cfd->var);
+      cfd->lower_bound = op1;
+    }
+
+  cilk_set_inclusive_and_direction (cfd);  
+  cilk_set_iter_difftype (cfd);
+  
+  /* Difference type cannot be NULL_TREE here for C.  */
+  cfd->count_type = cilk_check_loop_difference_type (cfd->difference_type);
+  if (cfd->count_type == NULL_TREE)
+    {
+      cfd->invalid = true;
+      return;
+    }
+
+  cilk_set_incr_info (cfd, false);
+}  
+
+/* Main entry-point function to gimplify a cilk_for statement.  *EXPR_P should
+   be a CILK_FOR_STMT tree.  */
+
+int
+c_gimplify_cilk_for (tree *expr_p, gimple_seq *pre_p,
+		     gimple_seq *post_p ATTRIBUTE_UNUSED)
+{
+  struct cilk_for_desc cfd;
+  tree cfor_expr = *expr_p;
+  
+  cfun->is_cilk_function = 1;
+  cilk_init_cfd (&cfd);
+  cfd.wd.block = block_cilk_for_loop (cfor_expr);
+  
+  c_extract_cilk_for_fields (&cfd, cfor_expr);
+  
+  *expr_p = NULL_TREE;
+  if (cfd.invalid)
+    return GS_ERROR;
+
+  tree var = CILK_FOR_VAR (cfor_expr);
+  tree init = CILK_FOR_INIT (cfor_expr);
+  tree init_expr = fold_build2 (MODIFY_EXPR, void_type_node, var, init);
+  
+  gimplify_and_add (init_expr, pre_p);
+  gimplify_cilk_for_1 (&cfd, pre_p);
+  return GS_ALL_DONE;
+}
diff --git a/gcc/c/c-objc-common.h b/gcc/c/c-objc-common.h
index 6ae7b3e..ca7fa4a 100644
--- a/gcc/c/c-objc-common.h
+++ b/gcc/c/c-objc-common.h
@@ -114,4 +114,7 @@ along with GCC; see the file COPYING3.  If not see
 #undef  LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP
 #define LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP  \
   cilk_detect_spawn_and_unwrap
+
+#undef  LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR
+#define LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR c_gimplify_cilk_for
 #endif /* GCC_C_OBJC_COMMON */
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index bae9708..fc82fcc 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1165,6 +1165,8 @@ static void c_parser_switch_statement (c_parser *);
 static void c_parser_while_statement (c_parser *, bool);
 static void c_parser_do_statement (c_parser *, bool);
 static void c_parser_for_statement (c_parser *, bool);
+static void c_parser_cilk_for_statement (c_parser *, enum rid, tree);
+static void c_parser_cilk_grainsize (c_parser *parser);
 static tree c_parser_asm_statement (c_parser *);
 static tree c_parser_asm_operands (c_parser *);
 static tree c_parser_asm_goto_operands (c_parser *);
@@ -4772,6 +4774,12 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    error_at (loc, "-fcilkplus must be enabled to use %<_Cilk_for%>");
+	  else
+	    c_parser_cilk_for_statement (parser, RID_CILK_FOR, NULL_TREE);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9382,6 +9390,24 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       c_parser_cilk_simd_construct (parser);
       return false;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma cilk grainsize%> ignored because "
+		   "-fcilkplus is not enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma cilk grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
+      return false;
+
     default:
       if (id < PRAGMA_FIRST_EXTERNAL)
 	{
@@ -13566,22 +13592,24 @@ c_parser_cilk_all_clauses (c_parser *parser)
    FOR_KEYWORD can be either RID_CILK_FOR or RID_FOR, for parsing
    _Cilk_for or the <#pragma simd> for loop construct respectively.
 
-   (NOTE: For now, only RID_FOR is handled).
-
+   CLAUSES_OR_GRAIN holds the clauses for a <#pragma simd> loop or the grain
+   for a _Cilk_for.
+
    For a <#pragma simd>, CLAUSES are the clauses that should have been
-   previously parsed.  If there are none, or if we are parsing a
-   _Cilk_for instead, this will be NULL.  */
+   previously parsed.  For _Cilk_for, this will be the grain that the user
+   passes in through a pragma.  If none is passed in, then this is NULL.  */
    
 static void
 c_parser_cilk_for_statement (c_parser *parser, enum rid for_keyword,
-			     tree clauses)
+			     tree clauses_or_grain)
 {
   tree init, decl,  cond, stmt;
   tree block, incr, save_break, save_cont, body;
-  location_t loc;
   bool fail = false;
+  tree clauses = (for_keyword == RID_FOR) ? clauses_or_grain : NULL_TREE;
+  tree grain = (for_keyword == RID_CILK_FOR) ? clauses_or_grain : NULL_TREE;
 
-  gcc_assert (/*for_keyword == RID_CILK_FOR || */for_keyword == RID_FOR);
+  gcc_assert (for_keyword == RID_CILK_FOR || for_keyword == RID_FOR);
 
   if (!c_parser_next_token_is_keyword (parser, for_keyword))
     {
@@ -13592,8 +13620,8 @@ c_parser_cilk_for_statement (c_parser *parser, enum rid for_keyword,
       return;
     }
 
-  loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
+  location_t loc = c_parser_peek_token (parser)->location;
 
   block = c_begin_compound_stmt (true);
 
@@ -13606,8 +13634,14 @@ c_parser_cilk_for_statement (c_parser *parser, enum rid for_keyword,
   /* Parse the initialization declaration.  */
   if (c_parser_next_tokens_start_declaration (parser))
     {
+      vec<c_token> none_clauses = vNULL;
+      c_token eof_token;
+      memset (&eof_token, 0, sizeof (eof_token));
+      eof_token.type = CPP_EOF;
+      none_clauses.safe_push (eof_token);
+      none_clauses.safe_push (eof_token);
       c_parser_declaration_or_fndef (parser, true, false, false,
-				     false, false, NULL, vNULL);
+				     false, false, NULL, none_clauses);
       decl = check_for_loop_decls (loc, flag_isoc99);
       if (decl == NULL)
 	goto error_init;
@@ -13653,8 +13687,7 @@ c_parser_cilk_for_statement (c_parser *parser, enum rid for_keyword,
   if (c_parser_next_token_is_not (parser, CPP_SEMICOLON))
     {
       location_t cond_loc = c_parser_peek_token (parser)->location;
-      struct c_expr cond_expr = c_parser_binary_expression (parser, NULL,
-							    NULL);
+      struct c_expr cond_expr = c_parser_binary_expression (parser, NULL, NULL);
 
       cond = cond_expr.value;
       cond = c_objc_common_truthvalue_conversion (cond_loc, cond);
@@ -13691,6 +13724,9 @@ c_parser_cilk_for_statement (c_parser *parser, enum rid for_keyword,
       if (for_keyword == RID_FOR)
 	c_finish_cilk_simd_loop (loc, decl, init, cond, incr, body, NULL,
 				 clauses, /*scan_body=*/true);
+      else
+	c_finish_cilk_for_loop (loc, decl, init, cond, incr, body, grain,
+				false, false);
     }
 
   stmt = c_end_compound_stmt (loc, block, true);
@@ -13699,6 +13735,48 @@ c_parser_cilk_for_statement (c_parser *parser, enum rid for_keyword,
   c_cont_label = save_cont;
 }
 
+/* Parses the grainsize pragma available in Cilk Plus.  The correct syntax
+   of this pragma is:
+	    #pragma cilk grainsize = <EXP>
+ */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+  
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>") != 0)
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR)
+	{
+	  error_at (input_location, "cannot convert grain to long integer");
+	  c_parser_skip_to_pragma_eol (parser);
+	}   
+      else if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token && token->type == CPP_KEYWORD
+	      && token->keyword == RID_CILK_FOR)
+	    {
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain && grain != error_mark_node) 
+		c_parser_cilk_for_statement (parser, RID_CILK_FOR, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
    loops.  */
 
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index c4dfc3b..55f2a7a 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -684,4 +684,7 @@ extern tree c_check_omp_declare_reduction_r (tree *, int *, void *);
 extern void pedwarn_c90 (location_t, int opt, const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
 extern void pedwarn_c99 (location_t, int opt, const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
 
+/* In c-cilk.c */
+extern int c_gimplify_cilk_for (tree *, gimple_seq *, gimple_seq *);
+
 #endif /* ! GCC_C_TREE_H */
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
index 8634194..a279a93 100644
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
\ No newline at end of file
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
index 216c7d4..92b757b 100644
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -69,7 +69,6 @@ cilk_arrow (tree frame_ptr, int field_number, bool volatil)
 		   field_number, volatil);
 }
 
-
 /* This function will add FIELD of type TYPE to a defined built-in 
    structure.  *NAME is the name of the field to be added.  */
 
@@ -104,6 +103,27 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns the FUNCTION_DECL for *NAME, whose count arguments are of TYPE.  */
+
+static tree
+cilk_declare_looper (const char *name, tree type, enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -267,6 +287,17 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_32",
+					    unsigned_intSI_type_node,
+					    BUILT_IN_CILK_FOR_32);
+
+
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_64",
+					    unsigned_intDI_type_node,
+					    BUILT_IN_CILK_FOR_64);
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
index 99b4d78..acdfb9c 100644
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,8 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -65,6 +67,140 @@ enum cilk_tree_index  {
   CILK_TI_MAX
 };
 
+enum add_variable_type {
+    /* Reference to previously-defined variable.  */
+    ADD_READ,
+    /* Definition of a new variable in inner-scope.  */
+    ADD_BIND,
+    /* Write to possibly previously-defined variable.  */
+    ADD_WRITE
+};
+
+enum cilk_block_type {
+    /* Indicates a _Cilk_spawn block.  30 was an arbitrary number picked for
+       ease of debugging.  */
+    CILK_BLOCK_SPAWN = 30,
+    /* Indicates _Cilk_for statement block.  */
+    CILK_BLOCK_FOR
+};
+
+struct wrapper_data
+{
+  /* Kind of function to be created.  */
+  enum cilk_block_type type;
+  /* Signature of helper function.  */
+  tree fntype;
+  /* Containing function.  */
+  tree context;
+  /* Disposition of all variables in the inner statement.  */
+  struct pointer_map_t *decl_map;
+  /* True if this function needs a static chain.  */
+  bool nested;
+  /* Arguments to be passed to wrapper function, currently a list.  */
+  tree arglist;
+  /* Argument types, a list.  */
+  tree argtypes;
+  /* Incoming parameters.  */
+  tree parms;
+  /* Outer BLOCK object.  */
+  tree block;
+};
+
+/* This structure holds all the important information necessary for decomposing
+   a cilk_for statement.  */
+
+struct cilk_for_desc
+{
+  /* Location of the _Cilk_for statement.  */
+  location_t loc;
+
+  /* Information about the wrapper/nested function for _Cilk_for.  */ 
+  struct wrapper_data wd;
+
+  /* Does the loop body trigger undefined behavior at runtime?  */
+  bool invalid;
+
+  /* Indicates if the parent function is a nested function (C++ only).  */
+  bool nested_ok;
+
+  /* Is the loop control variable a RECORD_TYPE?  */
+  bool iterator;
+
+  /* Does the loop range include its upper bound?  */
+  bool inclusive;
+
+  /* Does the loop control variable, after converting pointer to
+     machine address and taking into account sizeof pointed to
+     type, increment or decrement by (plus or minus) one?  */
+  bool exactly_one;
+
+  /* Is the increment stored in this structure to be added (+1)
+     or subtracted (-1)? */
+  signed char incr_sign;
+
+  /* Direction is +/-1 if the increment is known to be exactly one
+     in the user-visible units, +/-2 if the sign is known but the
+     value is not known to be one, and zero if the sign is not known
+     at compile time.  */
+  signed char direction;
+
+  /* Loop upper bound.  END_EXPR is the tree for the loop bound.
+     END_VAR is either END_EXPR or a VAR_DECL holding the stabilized
+     value, if computation of the value has side effects.  */
+  tree end_expr, end_var;
+
+  /* The originally-declared loop control variable.  */
+  tree var;
+
+  /* Lower bound of the loop if it is constant enough.
+     With a constant lower bound the loop body may not
+     need to use the static chain to compute the iterator
+     value.  */
+  tree lower_bound;
+
+  /* Several types:
+
+     The declared type of the loop control variable,
+     T1 in the cilk_for spec.
+
+     The type of the loop count and argument to loop body, currently
+     always unsigned long.  (If pointers are wider, we will need a
+     pointer-sized type.)
+
+     The static type of end, T2 in the cilk_for spec.
+
+     The difference type T3 of T1-T1, which is the same as T1 for
+     integral types.  The difference type may not be wider than the
+     count type.  For integers subtraction is done in count_type
+     in case difference_type can't hold the range.
+
+     If integral, the type of the increment is known to be no wider
+     than var_type otherwise the truncation in
+     VAR = (shorter)((longer)VAR + INCR)
+     would have been rejected.  */
+  tree var_type, count_type, difference_type;
+  tree incr;
+  tree cond;
+  /* The originally-declared body of the loop.  */
+  tree body;
+
+  /* Grainsize set by the user.  */
+  tree grain;
+
+  /* Context argument to generated function, if not (fdesc fn 1).  */
+  tree ctx_arg;
+
+  /* The number of loop iterations, in case the generated function
+     needs to know.  */
+  tree count;
+
+  /* Variables of the generated function.  */
+  tree ctx_parm, min_parm, max_parm;
+
+  /* Copy of the induction variable, but in a different function context.  */
+  tree var2;
+};
+
 extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 
 #define cilk_worker_fndecl            cilk_trees[CILK_TI_F_WORKER]
@@ -77,6 +213,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
@@ -90,6 +228,7 @@ extern tree cilk_dot (tree, int, bool);
 extern void cilk_init_builtins (void);
 extern void gimplify_cilk_sync (tree *, gimple_seq *);
 extern tree cilk_call_setjmp (tree);
+
 /* Returns true if Cilk Plus is enabled and if F->cilk_frame_decl is not
    NULL_TREE.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 951d9f6..dbb05b2 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7256,6 +7256,11 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	    }
 	  break;
 
+	case CILK_FOR_STMT:
+	  ret = (enum gimplify_status)
+	    lang_hooks.cilkplus.gimplify_cilk_for (expr_p, pre_p, post_p);
+	  break;
+
 	case CILK_SPAWN_STMT:
 	  gcc_assert 
 	    (fn_contains_cilk_spawn_p (cfun) 
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index 411cf74..e9c46b2 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -219,11 +219,13 @@ extern bool lhd_cilk_detect_spawn (tree *);
 #define LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP lhd_cilk_detect_spawn
 #define LANG_HOOKS_CILKPLUS_FRAME_CLEANUP lhd_install_body_with_frame_cleanup
 #define LANG_HOOKS_CILKPLUS_GIMPLIFY_SPAWN lhd_gimplify_expr
+#define LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR lhd_gimplify_expr
 
 #define LANG_HOOKS_CILKPLUS {			\
   LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP,	\
   LANG_HOOKS_CILKPLUS_FRAME_CLEANUP,		\
-  LANG_HOOKS_CILKPLUS_GIMPLIFY_SPAWN            \
+  LANG_HOOKS_CILKPLUS_GIMPLIFY_SPAWN,           \
+  LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR         \
 }
 
 #define LANG_HOOKS_DECLS { \
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 9539e7d..fe8e440 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -154,6 +154,11 @@ struct lang_hooks_for_cilkplus
      status, but as mentioned in a previous comment, we can't see that type 
      here, so just return an int.  */
   int (*gimplify_cilk_spawn) (tree *, gimple_seq *, gimple_seq *);
+
+  /* Function to gimplify a _Cilk_for statement.  Returns enum gimplify
+     status, but as mentioned in a previous comment, we can't see that type 
+     here, so just return an int.  */
+  int (*gimplify_cilk_for) (tree *, gimple_seq *, gimple_seq *);
 };
 
 /* Language hooks related to decls and the symbol table.  */
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk-for.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk-for.c
new file mode 100644
index 0000000..caab055
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk-for.c
@@ -0,0 +1,34 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int main(int argc, char **argv)
+{
+  char Array1[26], Array2[26];
+  char Array1_Serial[26], Array2_Serial[26];
+  
+  int ii = 0, error = 0;
+  for (ii = 0; ii < 26; ii++)  
+    { 
+      Array1[ii] = 'A'+ii;
+      Array1_Serial[ii] = 'A'+ii;
+    }
+  for (ii = 0; ii < 26; ii++)
+    {
+      Array2[ii] = 'a'+ii;
+      Array2_Serial[ii] = 'a'+ii;
+    }
+  ii = 0;
+  _Cilk_for (ii = 0 ; ii < 26; ii++) 
+    Array1[ii] = Array2[ii];
+
+  for (ii = 0; ii < 26; ii++)
+    Array1_Serial[ii] = Array2_Serial[ii];
+
+  for (ii = 0; ii < 26; ii++)  {
+    if (Array1_Serial[ii] != Array1[ii])  { 
+	error = 1; 
+    }
+  }
+  return error;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_decr.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_decr.c
new file mode 100644
index 0000000..e45b557
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_decr.c
@@ -0,0 +1,44 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus -w" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define ARRAY_SIZE 1000
+
+int a[ARRAY_SIZE];
+
+int main(void)
+{
+  int i= 0;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    a[i] = 0;
+  
+  _Cilk_for (i = (ARRAY_SIZE-1); i >= 0; i--) 
+    a[i] = i;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    if (a[i] != i)
+      return 1;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    a[i] = 0;
+  
+  _Cilk_for (i = (ARRAY_SIZE-1); i >= 0; i -= 1) 
+    a[i] = i;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    if (a[i] != i)
+      return 1;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    a[i] = 0;
+  
+  _Cilk_for (i = (ARRAY_SIZE-1); i >= 0; --i) 
+    a[i] = i;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    if (a[i] != i)
+      return 1;
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..022513f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,64 @@
+struct foo
+{
+  int x,y,z;
+  char q;
+};
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+  volatile int vii = 0;
+  static int sii = 0;
+  register int rii = 0;
+  extern int eii;
+  struct foo something, nothing;
+  float fii = 0;
+  _Cilk_for (ii; ii < 10; ii++) /* { dg-error " expected induction variable initialization" } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected induction variable initialization" } */
+    q = 2;
+
+  _Cilk_for (ii = 0; ; ii++) /* { dg-error "missing condition" } */
+    q = 2;
+
+  _Cilk_for (ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+  _Cilk_for (ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (vii = 0; vii < 10; vii++) /* { dg-error "induction variable cannot be volatile" } */
+    q = 5;
+
+  _Cilk_for (sii = 0; sii < 10; sii++) /* { dg-error "induction variable cannot be static" } */
+    q = 5;
+
+  _Cilk_for (rii = 0; rii < 10; rii++) /* { dg-error "induction variable cannot be declared register" } */
+    q = 5;
+
+  _Cilk_for (eii = 0; eii < 10; eii++)  /* { dg-error "induction variable cannot be extern" } */
+    q = 5;
+
+  _Cilk_for (something = nothing; ii < 10; ii++) /* { dg-error "induction variable must be of integral or" } */
+    q = 5;
+
+  _Cilk_for (fii = 3.47; fii < 5.23; ii++) /* { dg-error "induction variable must be of integral or pointer type" } */
+    q = 5;
+
+  _Cilk_for (ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (ii; ii < 10; ii++) /* { dg-error "expected induction variable initialization" } */
+    q = 5;
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..4c86bf6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,34 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus -w" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+  int ii = 0; 
+
+  for (ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..36d75df
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -fcilkplus" } */
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+  int ii = 0;
+
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK, it will do a type conversion to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..8eec6be
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,28 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus -w" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int *aa = 0;
+  int ii = 0;
+
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+
+  _Cilk_for(aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      return 1;
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_warning.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_warning.c
new file mode 100644
index 0000000..f39eb7b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_warning.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+int main (void)
+{
+  int ii = 0, q = 2;
+  _Cilk_for (ii = 0; ii < 10; ii++) /* { dg-warning "loop body modifies control variable" } */
+    ii += q;
+  
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp b/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
index 7407e8e..edab8eb 100644
--- a/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
+++ b/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
@@ -60,5 +60,12 @@ if { [check_effective_target_lto] } {
     dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -flto -g -fcilkplus $ALWAYS_CFLAGS" " "
 }
 
-
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -g -fcilkplus $ALWAYS_CFLAGS " " "
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O1 -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O2 -std=c99 -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O2 -ftree-vectorize -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O3 -g -fcilkplus $ALWAYS_CFLAGS" " "
+if { [check_effective_target_lto] } {
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O3 -flto -g -fcilkplus $ALWAYS_CFLAGS" " "
+}
 dg-finish
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7fe849d..de2a24b 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -2661,6 +2661,29 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
       pp_string (buffer, "_Cilk_sync");
       break;
 
+    case CILK_FOR_STMT:
+      if (CILK_FOR_GRAIN (node))
+	{
+	  pp_string (buffer, "#pragma cilk grainsize = ");
+	  dump_generic_node (buffer, CILK_FOR_GRAIN (node), spc, flags, false); 
+	  newline_and_indent (buffer, spc);
+	}
+      pp_string (buffer, "_Cilk_for (");
+      dump_generic_node (buffer, CILK_FOR_INIT (node), spc, flags, false);
+      pp_string (buffer, "; ");
+      dump_generic_node (buffer, CILK_FOR_COND (node), spc, flags, false);
+      pp_string (buffer, "; ");
+      dump_generic_node (buffer, CILK_FOR_EXPR (node), spc, flags, false);
+      pp_string (buffer, ")");
+      newline_and_indent (buffer, spc + 2);
+      pp_left_brace (buffer);
+      newline_and_indent (buffer, spc + 4);
+      dump_generic_node (buffer, CILK_FOR_BODY (node), spc + 4, flags, false);
+      newline_and_indent (buffer, spc + 2);
+      pp_right_brace (buffer);
+      is_expr = false;
+      break;
+
     default:
       NIY;
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index 8eecba7..9c0bfe2 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1285,6 +1285,16 @@ DEFTREECODE (CILK_SPAWN_STMT, "cilk_spawn_stmt", tcc_statement, 1)
 /* Cilk Sync statement: Does not have any operands.  */
 DEFTREECODE (CILK_SYNC_STMT, "cilk_sync_stmt", tcc_statement, 0)
 
+/* Cilk for statement
+   Operand 0 is the initializer.
+   Operand 1 is the loop terminating condition.
+   Operand 2 is the increment/decrement expression.
+   Operand 3 is the loop-body.
+   Operand 4 is the scope.
+   Operand 5 is the induction variable.
+   Operand 6 is the grain that is passed in through a pragma.  */
+DEFTREECODE (CILK_FOR_STMT, "cilk_for_stmt", tcc_statement, 7)
+
 /*
 Local variables:
 mode:c
diff --git a/gcc/tree.h b/gcc/tree.h
index e58b3a5..000a448 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -804,6 +804,15 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 /* Cilk keywords accessors.  */
 #define CILK_SPAWN_FN(NODE) TREE_OPERAND (CILK_SPAWN_STMT_CHECK (NODE), 0)
 
+/* CILK_FOR_STMT accessors.  */
+#define CILK_FOR_INIT(NODE)     TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 0)
+#define CILK_FOR_COND(NODE)     TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 1)
+#define CILK_FOR_EXPR(NODE)     TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 2)
+#define CILK_FOR_BODY(NODE)     TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 3)
+#define CILK_FOR_SCOPE(NODE)    TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 4)
+#define CILK_FOR_VAR(NODE)      TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 5)
+#define CILK_FOR_GRAIN(NODE)    TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 6)
+
 /* In a RESULT_DECL, PARM_DECL and VAR_DECL, means that it is
    passed by invisible reference (and the TREE_TYPE is a pointer to the true
    type).  */

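Note on the descriptor above: the count, grain, ctx_parm, min_parm and max_parm fields suggest that the loop body is outlined into a helper taking a context pointer and a half-open range of iteration numbers, and that the runtime divides the iteration count into grain-sized pieces. The following is only a serial sketch of that shape in plain C, under those assumptions. demo_ctx, demo_body, demo_cilk_for_32 and demo_decrementing_loop are hypothetical names invented for illustration; they are not the runtime's entry points and not the code this patch generates.

```c
#include <assert.h>
#include <stdint.h>

#define ARRAY_SIZE 1000

/* Hypothetical context record: stands in for whatever the compiler passes
   so the outlined body can reach the enclosing function's variables.  */
struct demo_ctx
{
  int *a;
  int start;	/* Initial value of the control variable.  */
  int step;	/* Signed increment, -1 for "i--".  */
};

/* Outlined loop body.  MIN_PARM and MAX_PARM delimit a half-open range of
   iteration numbers, not values of the control variable.  */
static void
demo_body (void *ctx_parm, uint32_t min_parm, uint32_t max_parm)
{
  struct demo_ctx *ctx = (struct demo_ctx *) ctx_parm;
  for (uint32_t n = min_parm; n < max_parm; n++)
    {
      int i = ctx->start + (int) n * ctx->step;
      ctx->a[i] = i;
    }
}

/* Serial stand-in for a divide-and-conquer driver (an assumption, not the
   real runtime): halve [LOW, HIGH) until a piece is no larger than GRAIN,
   then run the body on that piece.  */
static void
demo_cilk_for_32 (void (*body) (void *, uint32_t, uint32_t), void *ctx,
		  uint32_t low, uint32_t high, uint32_t grain)
{
  assert (grain > 0);
  while (high - low > grain)
    {
      uint32_t mid = low + (high - low) / 2;
      /* A parallel runtime would spawn this half.  */
      demo_cilk_for_32 (body, ctx, low, mid, grain);
      low = mid;
    }
  if (low < high)
    body (ctx, low, high);
}

/* Mirrors "_Cilk_for (i = ARRAY_SIZE - 1; i >= 0; i--) a[i] = i;" with a
   grain of 2.  Returns 0 if every element was written.  */
int
demo_decrementing_loop (void)
{
  static int a[ARRAY_SIZE];
  struct demo_ctx ctx = { a, ARRAY_SIZE - 1, -1 };
  int end = 0;
  /* Inclusive bound, downward direction: (start - end) / |step| + 1.  */
  uint32_t count = (uint32_t) ((ctx.start - end) / -ctx.step) + 1;

  for (int i = 0; i < ARRAY_SIZE; i++)
    a[i] = -1;
  demo_cilk_for_32 (demo_body, &ctx, 0, count, 2);
  for (int i = 0; i < ARRAY_SIZE; i++)
    if (a[i] != i)
      return 1;
  return 0;
}
```

Calling demo_decrementing_loop () from a test's main should return 0, the same pass/fail convention the cilk_for_decr.c test uses.
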
[-- Attachment #3: diff_cilk_for_c++.txt --]
[-- Type: text/plain, Size: 62801 bytes --]

diff --git a/gcc/cp/cp-cilk.c b/gcc/cp/cp-cilk.c
index 0da95e8..bd114c8
--- a/gcc/cp/cp-cilk.c
+++ b/gcc/cp/cp-cilk.c
@@ -23,8 +23,13 @@
 #include "system.h"
 #include "coretypes.h"
 #include "cp-tree.h"
+#include "tree.h"
 #include "tree-iterator.h"
 #include "cilk.h"
+#include "langhooks.h"
+#include "cgraph.h"
+#include "gimple.h"
+#include "gimplify.h"
 
 /* Sets the EXCEPTION bit (0x10) in the FRAME.flags field.  */
 
@@ -116,3 +121,483 @@ cilk_create_lambda_fn_tmp_var (tree lambda_fn)
   add_local_decl (cfun, return_var);
   return return_var;
 }
+
+/* Returns an overloaded function that does operation based on CODE using
+   OP0 and OP1.  If CRY is set to true, then the function complains when
+   it is unable to find an overloaded operator.  */
+
+static tree
+callable (location_t loc, enum tree_code code, tree op0, tree op1, bool cry)
+{
+  vec<tree, va_gc> *op1_vec = make_tree_vector_single (op1);
+  if (code == INIT_EXPR)
+    return build_special_member_call (NULL_TREE, complete_ctor_identifier,
+				      &op1_vec,
+				      TYPE_MAIN_VARIANT (TREE_TYPE (op1)), 0,
+				      cry);
+    
+  if (code == PSEUDO_DTOR_EXPR)
+    return build_special_member_call (NULL_TREE, complete_dtor_identifier,
+				      &op1_vec,
+				      TYPE_MAIN_VARIANT (TREE_TYPE (op1)), 0,
+				      cry);
+
+  int flags = LOOKUP_PROTECT | LOOKUP_ONLYCONVERTING;
+  tree exp = build_new_op (EXPR_LOCATION (op1), code, flags, op0, op1,
+			   NULL_TREE, NULL, 0);
+  if (exp == error_mark_node)
+    exp = build_x_modify_expr (EXPR_LOCATION (op1), op0, code, op1, tf_none);
+  if (exp && exp != error_mark_node)
+    return exp;
+
+  const char *op = operator_name_info[(int) code].name;
+  const char *explain = cry ? "" : " accessible, unambiguous";
+  if (op1) 
+    error_at (loc, "No%s operator%s(%T,%T) for _Cilk_for loop", explain, op, 
+	      TREE_TYPE (op0), TREE_TYPE (op1)); 
+  else 
+    error_at (loc, "No%s operator%s(%T,%T) for _Cilk_for loop", explain, op, 
+	      TREE_TYPE (op0), TREE_TYPE (op0));
+  return NULL_TREE;
+}
+
+/* Calculates the COUNT_UP and/or COUNT_DOWN values for a _Cilk_for loop using
+   its characteristics stored in *CFD.  */
+
+static void
+calc_count_up_count_down (struct cilk_for_desc *cfd, tree *count_up,
+			  tree *count_down)
+{
+  /* Reasoning for high and low variables can be found in
+     cilk_compute_loop_count in c-family/cilk.c.  */
+  tree high = cfd->end_var ? cfd->end_var : cfd->end_expr;
+  tree low = cfd->lower_bound ? cfd->lower_bound : cfd->var;
+
+  /* When these are invalid, we flag them in cilk_compute_loop_var.  This
+     condition is a bit rare.  */
+  if (high == error_mark_node || low == error_mark_node)
+    return;
+  
+  /* Only call this function if we are using an iterator.  */
+  gcc_assert (cfd->iterator);
+  
+  if (TREE_CODE (high) == TARGET_EXPR)
+    high = TARGET_EXPR_INITIAL (high);
+  if (TREE_CODE (low) == TARGET_EXPR)
+    low = TARGET_EXPR_INITIAL (low);
+  
+  if (TREE_CODE (low) == TREE_LIST)
+    low = TREE_VALUE (low);
+  high = cilk_tree_operand_noconv (high);
+  if (cfd->direction >= 0)
+    {
+      *count_up = build_x_binary_op (cfd->loc, MINUS_EXPR, high,
+				     TREE_CODE (high), low, TREE_CODE (low),
+				     NULL, tf_warning_or_error);
+      /* We should have already failed if this operator is not callable.  */
+      gcc_assert (*count_up != error_mark_node);
+    }
+  else
+    {
+      *count_down = build_x_binary_op (cfd->loc, MINUS_EXPR, low,
+				       TREE_CODE (low), high, TREE_CODE (high),
+				       NULL, tf_warning_or_error);
+      /* ...same reasoning as count up for the assert below.  */
+      gcc_assert (*count_down != error_mark_node);
+    }
+}
+
+/* Handler for iterator to compute the loop variable.  ADD_OP indicates
+   whether we need a '+' or '-' operation.  LOW indicates the starting point,
+   LOOP_VAR is the induction variable and VAR2 receives the result.  Returns
+   an expression (or a STATEMENT_LIST of expressions).  If it is unable to
+   find the appropriate operator, then it returns error_mark_node and its
+   parent will set the loop as invalid.  */
+
+static tree
+compute_loop_var_cp_iter_hdl (location_t loc, enum tree_code add_op,
+			      tree low, tree loop_var, tree var2)
+{
+  tree exp = build_new_op (loc, add_op, 0, low, loop_var, NULL_TREE, 0,
+			   tf_none);
+  if (exp == error_mark_node)
+    {
+      /* If we are here then operator+ or operator- could not be found.
+	 So, the other option is to use +=.  This requires storing values
+	 in the variable and then adding them one by one.  */
+      tree new_var = var2;
+      exp = alloc_stmt_list ();
+      tree new_stmt = build_x_modify_expr (loc, new_var, INIT_EXPR,
+					   build_zero_cst (TREE_TYPE (new_var)),
+					   tf_warning_or_error);
+      if (new_stmt == error_mark_node)
+	return error_mark_node;
+      append_to_statement_list (new_stmt, &exp);
+      new_stmt = build_x_modify_expr (loc, new_var, NOP_EXPR, low,
+				      tf_warning_or_error);
+      if (new_stmt == error_mark_node)
+	return error_mark_node;
+      append_to_statement_list (new_stmt, &exp);
+      new_stmt = build_x_modify_expr (loc, new_var, add_op, loop_var,
+				      tf_warning_or_error);
+      if (new_stmt == error_mark_node)
+	return error_mark_node;
+      append_to_statement_list (new_stmt, &exp);
+      return exp;
+    }
+  exp = cp_build_modify_expr (var2, INIT_EXPR, exp, tf_warning_or_error);
+  return exp;
+}
+
+/* Creates the nested function for a _Cilk_for using the loop's characteristic
+   information from CFD.  Returns its FUNCTION_DECL, or error_mark_node if the
+   loop variable computation could not be built.  */
+
+static tree
+cp_create_cilk_for_body (struct cilk_for_desc *cfd)
+{
+  push_function_context ();
+  declare_cilk_for_parms (cfd);
+  cfd->wd.fntype = build_function_type (void_type_node, cfd->wd.argtypes);
+
+  tree fndecl = cilk_create_cilk_helper_decl (&cfd->wd);
+  fndecl = build_lang_decl (FUNCTION_DECL, DECL_NAME (fndecl), cfd->wd.fntype);
+  if (cfd->nested_ok)
+    DECL_CONTEXT (fndecl) = current_function_decl;
+  else
+    DECL_CONTEXT (fndecl) = DECL_CONTEXT (current_function_decl);
+
+  tree outer = current_function_decl;
+  SET_DECL_LANGUAGE (fndecl, lang_c);
+  start_preparsed_function (fndecl, NULL_TREE, SF_PRE_PARSED);
+
+  declare_cilk_for_vars (cfd, fndecl);
+  
+  tree lower_bound = cfd->lower_bound;
+  struct gimplify_ctx gctx;
+
+  tree body = begin_compound_stmt (BCS_FN_BODY);
+  push_gimplify_context (&gctx);
+
+  gimple_add_tmp_var (cfd->var2);
+
+  /* Get the lower bound into a variable unless it is a constant or a
+     non-copyable value.  If non-copyable value, then reference value from
+     the outer frame.  */
+  if (!lower_bound)
+    {
+      lower_bound = cfd->var;
+      tree hack = build_decl (cfd->loc, VAR_DECL, NULL_TREE,
+			      TREE_TYPE (lower_bound));
+      DECL_CONTEXT (hack) = DECL_CONTEXT (lower_bound);
+      *pointer_map_insert (cfd->wd.decl_map, hack) = lower_bound;
+      lower_bound = hack;
+    }
+  tree cast_max_expr, count_type, pre, loop_var;
+  if (INTEGRAL_TYPE_P (cfd->var_type))
+    {
+      loop_var = create_tmp_var (cfd->var_type, NULL);
+      count_type = cfd->var_type;
+      tree cvt_expr = cp_fold_convert (cfd->var_type, cfd->min_parm);
+      pre = build_x_modify_expr (cfd->loc, loop_var, NOP_EXPR, cvt_expr,
+				 tf_warning_or_error);
+      cast_max_expr = cp_fold_convert (count_type, cfd->max_parm);
+    }
+  else
+    {
+      loop_var = create_tmp_var (TREE_TYPE (cfd->min_parm), NULL);
+      count_type = cfd->count_type;
+      pre = fold_build2 (INIT_EXPR, void_type_node, loop_var, cfd->min_parm);
+      cast_max_expr = cfd->max_parm;
+    }
+
+  tree loop_body = alloc_stmt_list ();
+  
+  /* Concatenate the control variable initialization with the loop body.
+     Do not call gimplify_and_add to append to list because we need
+     to wrap the entire list in a cleanup point expr to delay destruction
+     of the control variable to the end of the loop if it is an iterator.  */
+  tree loop_end_comp = cilk_compute_loop_var (cfd, loop_var, lower_bound,
+					      compute_loop_var_cp_iter_hdl);
+  if (loop_end_comp == error_mark_node)
+    {
+      cfd->invalid = true;
+      return error_mark_node;
+    }
+  append_to_statement_list (loop_end_comp, &loop_body);
+  tree cleanup = cxx_maybe_build_cleanup (cfd->var2, tf_none);
+  if (cleanup)
+    {
+      append_to_statement_list (cfd->body, &loop_body);
+      append_to_statement_list (cleanup, &loop_body);
+    }
+  else
+    append_to_statement_list (cfd->body, &loop_body);
+
+  loop_body = fold_build_cleanup_point_expr (void_type_node, loop_body);
+  DECL_SEEN_IN_BIND_EXPR_P (cfd->var2) = 1;
+
+  cfd->wd.context = outer;
+  bool throws = flag_exceptions ? cp_function_chain->can_throw : false;
+  cilk_outline_body (fndecl, &loop_body, &cfd->wd, &throws);
+  cp_function_chain->can_throw = throws;
+  
+  /* We have to manually create this loop for two reasons:
+     a. We need to have access to continue and start label since we need
+        to resolve continue and breaks by hand.
+     b. C++ doesn't provide a c_finish_loop function like C does.  */
+  tree c_for_loop = push_stmt_list ();
+  tree slab = build_decl (cfd->loc, LABEL_DECL, NULL_TREE, void_type_node);
+  DECL_ARTIFICIAL (slab) = 0;
+  DECL_IGNORED_P (slab) = 1;
+  DECL_CONTEXT (slab) = fndecl;
+  tree top_label = build1 (LABEL_EXPR, void_type_node, slab);
+
+  tree cont_lab = build_decl (cfd->loc, LABEL_DECL, NULL_TREE, void_type_node);
+  DECL_ARTIFICIAL (cont_lab) = 0;
+  DECL_IGNORED_P (cont_lab) = 1;
+  DECL_CONTEXT (cont_lab) = fndecl;
+
+  tree continue_label = build1 (LABEL_EXPR, void_type_node, cont_lab);
+  tree loop_cond = fold_build2 (LT_EXPR, boolean_type_node, loop_var,
+				cast_max_expr);
+  tree cond_expr = build3 (COND_EXPR, void_type_node, loop_cond,
+			   build1 (GOTO_EXPR, void_type_node, slab),
+			   build_empty_stmt (cfd->loc));
+  tree mod_expr = fold_build2 (MODIFY_EXPR, void_type_node, loop_var,
+				build2 (PLUS_EXPR, count_type, loop_var,
+					build_one_cst (count_type)));
+  add_stmt (pre);
+  add_stmt (top_label);
+  add_stmt (loop_body);
+  add_stmt (continue_label);
+  add_stmt (mod_expr);
+  add_stmt (cond_expr);
+  pop_stmt_list (c_for_loop);
+
+  /* Resolve all the continues in the _Cilk_for body here.  */
+  walk_tree (&c_for_loop, cilk_resolve_continue_stmts, (void *) cont_lab, NULL);
+  add_stmt (c_for_loop);
+
+  DECL_INITIAL (fndecl) = make_node (BLOCK);
+  TREE_USED (DECL_INITIAL (fndecl)) = 1;
+  BLOCK_VARS (DECL_INITIAL (fndecl)) = loop_var;
+  TREE_CHAIN (loop_var) = cfd->var2;
+
+  body = build3 (BIND_EXPR, void_type_node, loop_var, body,
+		 DECL_INITIAL (fndecl));
+  DECL_CONTEXT (cfd->var2) = fndecl;
+  pop_gimplify_context (0);
+
+  finish_function_body (body);
+  
+  /* A nested function cannot be expanded or deferred until its parent is done.
+     So, don't call expand_or_defer_fn here.  A non-nested function must be
+     done here.  */
+  if (!cfd->nested_ok)
+    expand_or_defer_fn (fndecl);
+  
+  pop_function_context ();
+  return fndecl;
+}
+
+/* Creates a nested function for the _Cilk_for statement using its information
+   in CFD.  PRE_P is the preceding gimple sequence.  */
+
+static tree
+create_cilk_for_nested_fn (struct cilk_for_desc *cfd, gimple_seq *pre_p)
+{
+  tree var = cfd->var;
+  DECL_CONTEXT (var) = current_function_decl;
+
+  if (POINTER_TYPE_P (TREE_TYPE (var)))
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_WRITE);
+  else
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_READ);
+
+  tree incr = cfd->incr;
+
+  /* If the loop increment is not an integer constant and is not a DECL,
+     copy it to a temporary.  If it is modified during the loop the behavior
+     is undefined.  Races could be avoided by copying it to a temporary
+     variable.  */
+  if (TREE_CODE (incr) != INTEGER_CST && !DECL_P (incr))
+    {
+      incr = get_formal_tmp_var (incr, pre_p);
+      cfd->incr = incr;
+    }
+
+  if (DECL_P (incr) && !TREE_STATIC (incr) && !DECL_EXTERNAL (incr))
+    *pointer_map_insert (cfd->wd.decl_map, incr) = incr;
+
+  /* Map the loop variable to integer_minus_one_node if we won't really be
+     passing it into the loop body.  Otherwise map to integer_zero_node.  */
+  *pointer_map_insert (cfd->wd.decl_map, var) =
+    (void *) (cfd->lower_bound ? integer_minus_one_node : integer_zero_node);
+  cilk_extract_free_variables (cfd->body, &cfd->wd, ADD_READ);
+
+  tree fn = cp_create_cilk_for_body (cfd);
+
+  /* FN can be error_mark_node when, for example, the appropriate overloaded
+     operator could not be found.  */
+  if (fn == error_mark_node)
+    return error_mark_node;
+
+  DECL_UNINLINABLE (fn) = 1;
+  DECL_STATIC_CHAIN (fn) = 1;
+
+  current_function_decl = fn;
+  /* Genericize the _Cilk_for body, mainly split up the _Cilk_for body and
+     the for-loop we inserted.  */
+  cp_genericize (fn);
+  return fn;
+}
+
+/* Helper function to gimplify a CILK_FOR_STMT.  CFD holds all the values
+   extracted from a CILK_FOR_STMT and *PRE_P is the preceding sequence.  */
+
+static void
+gimplify_cilk_for_1 (struct cilk_for_desc cfd, gimple_seq *pre_p)
+{
+  bool order_variable = false;
+  tree parent_function = current_function_decl;
+  
+  if (TREE_SIDE_EFFECTS (cfd.end_expr))
+    {
+      enum tree_code ecode = TREE_CODE (cfd.end_expr);
+      if (ecode == INIT_EXPR || ecode == MODIFY_EXPR)
+	cfd.end_var = TREE_OPERAND (cfd.end_expr, 0);
+      else if (ecode == TARGET_EXPR)
+	{
+	  cfd.end_var = TARGET_EXPR_INITIAL (cfd.end_expr);
+	  if (TREE_CODE (cfd.end_var) == AGGR_INIT_EXPR)
+	    cfd.end_var = TARGET_EXPR_SLOT (cfd.end_expr);
+	  else
+	    cfd.end_var = get_formal_tmp_var (cfd.end_var, pre_p);
+	}
+      else if (ecode == CALL_EXPR)
+	cfd.end_var = cfd.end_expr;
+      else
+	{
+	  tree ii_tree = cfd.end_expr;
+	  while (TREE_CODE_CLASS (TREE_CODE (ii_tree)) == tcc_unary)
+	    ii_tree = TREE_OPERAND (ii_tree, 0);
+	  if (TREE_CODE (ii_tree) == ADDR_EXPR)
+	    ii_tree = TREE_OPERAND (ii_tree, 0);
+	  ecode = TREE_CODE (ii_tree);
+	  tree tmp_var = cilk_tree_operand_noconv (cfd.end_expr);
+	  cfd.end_var = get_formal_tmp_var (tmp_var, pre_p);
+	  order_variable = true;
+	}
+    }
+  tree cond = cfd.cond;
+  tree op1 = TREE_OPERAND (cond, 1);
+  tree op0 = TREE_OPERAND (cond, 0);
+  enum tree_code cond_code = TREE_CODE (cond);
+
+  /* In this case below, we have an overloaded boolean comparison operation.  */
+  if (cond_code == CALL_EXPR)
+    {
+      cond_code = cilk_find_code_from_call (CALL_EXPR_FN (cond));
+      op1 = cilk_tree_operand_noconv (CALL_EXPR_ARG (cond, 1));
+      op0 = cilk_tree_operand_noconv (CALL_EXPR_ARG (cond, 0));
+      if (TREE_CODE (op0) == ADDR_EXPR || TREE_CODE (op0) == INDIRECT_REF)
+	op0 = TREE_OPERAND (op0, 0);
+    }
+  if (order_variable && op1 == cfd.end_expr)
+    op1 = cfd.end_var;
+  else if (order_variable && op0 == cfd.end_expr)
+    op0 = cfd.end_var;
+  
+  cond = callable (cfd.loc, cond_code, op0, op1, false);
+  gcc_assert (cond != NULL_TREE);
+
+  if (TREE_CODE (TREE_TYPE (cond)) != BOOLEAN_TYPE)
+    cond = perform_implicit_conversion (boolean_type_node, cond,
+					tf_warning_or_error);
+  enum tree_code div_op = NOP_EXPR;
+  tree forward = NULL_TREE, count_up = NULL_TREE, count_down = NULL_TREE;
+  cilk_calc_forward_div_op (&cfd, &div_op, &forward);
+  if (cfd.iterator)
+    calc_count_up_count_down (&cfd, &count_up, &count_down);
+  
+  tree count = cilk_compute_loop_count (&cfd, div_op, forward, count_up,
+					count_down);
+  tree fn = create_cilk_for_nested_fn (&cfd, pre_p);
+  if (fn == error_mark_node)
+    return;
+  cfd.cond = cond;
+  
+  current_function_decl = parent_function;
+  gimple_seq inner_seq = insert_cilk_for_nested_fn (&cfd, count, fn);
+  gimple_seq_add_seq (pre_p, inner_seq);
+}
+
+/* Extract all the relevant information from CFOR, a CILK_FOR_STMT tree
+   and store them in CFD structure.  */
+
+static void
+cp_extract_cilk_for_fields (struct cilk_for_desc *cfd, tree cfor)
+{
+  cfd->var = CILK_FOR_VAR (cfor);
+  cfd->cond = CILK_FOR_COND (cfor);
+  cfd->lower_bound = CILK_FOR_INIT (cfor);
+  cfd->incr = CILK_FOR_EXPR (cfor);
+  cfd->loc = EXPR_LOCATION (cfor);
+  cfd->body = CILK_FOR_BODY (cfor);
+  cfd->grain = CILK_FOR_GRAIN (cfor);
+  cfd->invalid = false;
+
+  /* This function shouldn't be setting these two variables.  */
+  cfd->ctx_arg = NULL_TREE;
+  cfd->count = NULL_TREE;
+  
+  cilk_set_init_info (cfd);
+  cilk_set_inclusive_and_direction (cfd);
+  cilk_set_iter_difftype (cfd);
+
+  if (cfd->iterator)
+    {
+      tree exp = NULL_TREE;
+      tree hack = build_decl (cfd->loc, VAR_DECL, NULL_TREE,
+			      TREE_TYPE (cfd->var));
+      if (cfd->direction >= 0)
+	exp = callable (cfd->loc, MINUS_EXPR, hack, cfd->var, true);
+      else
+	exp = callable (cfd->loc, MINUS_EXPR, cfd->var, hack, true);
+      if (!exp) 
+	{ 
+	  cfd->invalid = true;
+	  return;
+	}
+      cfd->difference_type = TYPE_MAIN_VARIANT (TREE_TYPE (exp));
+    }
+  cfd->count_type = cilk_check_loop_difference_type (cfd->difference_type);
+  cilk_set_incr_info (cfd, true);
+}
+
+/* Entry function to gimplify a CILK_FOR_STMT, *FOR_P.  *PRE_P and *POST_P are
+   the preceding and succeeding gimple sequences of *FOR_P, respectively.  */
+
+int
+cp_gimplify_cilk_for (tree *for_p, gimple_seq *pre_p,
+		      gimple_seq *post_p ATTRIBUTE_UNUSED)
+{
+  struct cilk_for_desc cfd;
+
+  cfun->is_cilk_function = 1;
+  cilk_init_cfd (&cfd);
+
+  cp_extract_cilk_for_fields (&cfd, *for_p);
+  if (cfd.invalid)
+    {
+      *for_p = build_empty_stmt (cfd.loc);
+      return GS_ERROR;
+    }
+  cfd.nested_ok = !DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (current_function_decl);
+  gimplify_cilk_for_1 (cfd, pre_p);
+  *for_p = NULL_TREE;
+
+  return GS_ALL_DONE;
+}
+
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index e8ccf1a..ed630bf 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -268,6 +268,23 @@ genericize_cp_loop (tree *stmt_p, location_t start_locus, tree cond, tree body,
   *stmt_p = stmt_list;
 }
 
+/* Genericize a CILK_FOR_STMT node *STMT_P.  */
+
+static void
+genericize_cilk_for_stmt (tree *stmt_p, int *walk_subtrees, void *data)
+{
+  tree stmt = *stmt_p;
+  cp_walk_tree (&CILK_FOR_COND (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_INIT (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_GRAIN (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_VAR (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_EXPR (stmt), cp_genericize_r, data, NULL);
+
+  /* _Cilk_for body will be resolved after it is inserted into a nested
+     function.  */
+  *walk_subtrees = 0;
+} 
+
 /* Genericize a FOR_STMT node *STMT_P.  */
 
 static void
@@ -1120,6 +1137,8 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void *data)
     gcc_assert (!CONVERT_EXPR_VBASE_PATH (stmt));
   else if (TREE_CODE (stmt) == FOR_STMT)
     genericize_for_stmt (stmt_p, walk_subtrees, data);
+  else if (TREE_CODE (stmt) == CILK_FOR_STMT)
+    genericize_cilk_for_stmt (stmt_p, walk_subtrees, data);
   else if (TREE_CODE (stmt) == WHILE_STMT)
     genericize_while_stmt (stmt_p, walk_subtrees, data);
   else if (TREE_CODE (stmt) == DO_STMT)
diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index 77a66c3..baf3ee3 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -164,4 +164,7 @@ extern void cp_common_init_ts (void);
 #undef  LANG_HOOKS_CILKPLUS_FRAME_CLEANUP
 #define LANG_HOOKS_CILKPLUS_FRAME_CLEANUP cp_cilk_install_body_wframe_cleanup
 
+#undef  LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR
+#define LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR cp_gimplify_cilk_for
+
 #endif /* GCC_CP_OBJCP_COMMON */
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 9a950f0..c0c0291 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5686,6 +5686,9 @@ extern void finish_for_init_stmt		(tree);
 extern void finish_for_cond			(tree, tree, bool);
 extern void finish_for_expr			(tree, tree);
 extern void finish_for_stmt			(tree);
+extern tree begin_cilk_for_stmt                 (tree, tree);
+extern void finish_cilk_for_init_stmt           (tree);
+extern tree finish_cilk_for_stmt                (tree);
 extern tree begin_range_for_stmt		(tree, tree);
 extern void finish_range_for_decl		(tree, tree, tree);
 extern void finish_range_for_stmt		(tree);
@@ -6193,6 +6196,8 @@ extern int gimplify_cilk_spawn                  (tree *, gimple_seq *,
 /* In cp/cp-cilk.c */
 extern void cp_cilk_install_body_wframe_cleanup (tree, tree);
 extern tree cilk_create_lambda_fn_tmp_var       (tree);
+extern int cp_gimplify_cilk_for                 (tree *, gimple_seq *,
+						 gimple_seq *);
 /* -- end of C++ */
 
 #endif /* ! GCC_CP_TREE_H */
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index ced596e..ae03c56 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -1542,6 +1542,7 @@ begin_scope (scope_kind kind, tree entity)
     case sk_try:
     case sk_catch:
     case sk_for:
+    case sk_cilk_for:
     case sk_cond:
     case sk_class:
     case sk_scoped_enum:
diff --git a/gcc/cp/name-lookup.h b/gcc/cp/name-lookup.h
index 57641a1..66d1876 100644
--- a/gcc/cp/name-lookup.h
+++ b/gcc/cp/name-lookup.h
@@ -107,6 +107,8 @@ typedef enum scope_kind {
   sk_catch,	     /* A catch-block.  */
   sk_for,	     /* The scope of the variable declared in a
 			for-init-statement.  */
+  sk_cilk_for,       /* The scope of the variable declared in _Cilk_for init
+			statement.  */
   sk_cond,	     /* The scope of the variable declared in the condition
 			of an if or switch statement.  */
   sk_function_parms, /* The scope containing function parameters.  */
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 3a22e90..25e2796 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -237,6 +237,8 @@ static void cp_parser_cilk_simd_construct
   (cp_parser *, cp_token *);
 static tree cp_parser_cilk_for
   (cp_parser *, enum rid, tree);
+static void cp_parser_cilk_grainsize
+  (cp_parser *, cp_token *);
 static bool cp_parser_omp_declare_reduction_exprs
   (tree, cp_parser *);
 
@@ -2060,7 +2062,8 @@ static tree cp_parser_decltype
 /* Declarators [gram.dcl.decl] */
 
 static tree cp_parser_init_declarator
-  (cp_parser *, cp_decl_specifier_seq *, vec<deferred_access_check, va_gc> *, bool, bool, int, bool *, tree *);
+  (cp_parser *, cp_decl_specifier_seq *, vec<deferred_access_check, va_gc> *,
+   bool, bool, int, bool *, tree *, tree *);
 static cp_declarator *cp_parser_declarator
   (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool);
 static cp_declarator *cp_parser_direct_declarator
@@ -9350,6 +9353,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 
 	case RID_WHILE:
 	case RID_DO:
+	case RID_CILK_FOR:
 	case RID_FOR:
 	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
@@ -10505,6 +10509,17 @@ cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
       }
       break;
 
+    case RID_CILK_FOR:
+      if (!flag_enable_cilkplus)
+	{ 
+	  error_at (token->location, 
+		    "-fcilkplus must be enabled to use %<_Cilk_for%>");
+	  statement = error_mark_node;
+	}
+      else
+	statement = cp_parser_cilk_for (parser, RID_CILK_FOR, NULL_TREE);
+      break;
+
     default:
       cp_parser_error (parser, "expected iteration-statement");
       statement = error_mark_node;
@@ -10624,6 +10639,10 @@ cp_parser_jump_statement (cp_parser* parser)
 	case IN_OMP_FOR:
 	  error_at (token->location, "break statement used with OpenMP for loop");
 	  break;
+	case IN_CILK_FOR_STMT:
+	  error_at (token->location,
+		    "break statement used in _Cilk_for loop body");
+	  break;
 	case IN_CILK_P_SIMD_FOR:
 	  error_at (token->location,
 		    "break statement within <#pragma simd> loop body");
@@ -10639,6 +10658,7 @@ cp_parser_jump_statement (cp_parser* parser)
 	  error_at (token->location, "continue statement not within a loop");
 	  break;
 	case IN_ITERATION_STMT:
+	case IN_CILK_FOR_STMT:
 	case IN_OMP_FOR:
 	  statement = finish_continue_stmt ();
 	  break;
@@ -11189,7 +11209,7 @@ cp_parser_simple_declaration (cp_parser* parser,
 					/*member_p=*/false,
 					declares_class_or_enum,
 					&function_definition_p,
-					maybe_range_for_decl);
+					maybe_range_for_decl, NULL);
       /* If an error occurred while parsing tentatively, exit quickly.
 	 (That usually happens when in the body of a function; each
 	 statement is treated as a declaration-statement until proven
@@ -16440,7 +16460,8 @@ cp_parser_init_declarator (cp_parser* parser,
 			   bool member_p,
 			   int declares_class_or_enum,
 			   bool* function_definition_p,
-			   tree* maybe_range_for_decl)
+			   tree* maybe_range_for_decl,
+			   tree* init)
 {
   cp_token *token = NULL, *asm_spec_start_token = NULL,
            *attributes_start_token = NULL;
@@ -16448,7 +16469,9 @@ cp_parser_init_declarator (cp_parser* parser,
   tree prefix_attributes;
   tree attributes = NULL;
   tree asm_specification;
-  tree initializer;
+  /* Initialize INITIALIZER to avoid a "may be used uninitialized"
+     warning/error.  */
+  tree initializer = NULL_TREE;
   tree decl = NULL_TREE;
   tree scope;
   int is_initialized;
@@ -16585,7 +16608,8 @@ cp_parser_init_declarator (cp_parser* parser,
 	      DECL_STRUCT_FUNCTION (decl)->function_start_locus
 		= func_brace_location;
 	    }
-
+	  if (init)
+	    *init = initializer;
 	  return decl;
 	}
     }
@@ -16820,6 +16844,8 @@ cp_parser_init_declarator (cp_parser* parser,
 	finish_fully_implicit_template (parser, /*member_decl_opt=*/0);
     }
 
+  if (init)
+    *init = initializer;
   return decl;
 }
 
@@ -22985,6 +23011,7 @@ cp_parser_single_declaration (cp_parser* parser,
 				        member_p,
 				        declares_class_or_enum,
 				        &function_definition_p,
+					NULL,
 					NULL);
 
     /* 7.1.1-1 [dcl.stc]
@@ -31243,6 +31270,21 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
       cp_parser_cilk_simd_construct (parser, pragma_tok);
       return true;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (context == pragma_external)
+	{
+	  error_at (pragma_tok->location,
+		    "%<#pragma cilk grainsize%> may only be used inside a "
+		    "function");
+	  break;
+	}
+
+      /* Ignore the pragma if Cilk Plus is not enabled.  */
+      if (flag_enable_cilkplus)
+	{
+	  cp_parser_cilk_grainsize (parser, pragma_tok);
+	  return true;
+	}
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -31608,28 +31650,135 @@ cp_parser_simd_for_init_statement (cp_parser *parser, tree *init,
   return decl;
 }
 
+static tree
+cp_parser_cilk_for_init_statement (cp_parser *parser, tree *init)
+{
+  cp_token *token = cp_lexer_peek_token (parser->lexer);
+  location_t loc = token->location;
+  tree decl_init = NULL_TREE;
+  if (token->type == CPP_SEMICOLON)
+    {
+      error_at (loc, "expected induction variable");
+      return error_mark_node;
+    }
+
+  if (cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_REGISTER)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_EXTERN)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_MUTABLE)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_THREAD))
+    {
+      error_at (loc, "storage class is not allowed");
+      cp_lexer_consume_token (parser->lexer);
+    }
+
+  if (token->type == CPP_NAME)
+    {
+      tree type = cp_parser_lookup_name_simple (parser, token->u.value, loc);
+      if (TREE_CODE (type) == VAR_DECL || TREE_CODE (type) == PARM_DECL)
+	{
+	  error_at (loc, "_Cilk_for loop initializer must declare variable");
+	  cp_parser_skip_to_end_of_statement (parser);
+	  return error_mark_node;
+	}
+    }
+  int flags = 0;
+  cp_decl_specifier_seq specs;
+  cp_parser_decl_specifier_seq (parser, CP_PARSER_FLAGS_NONE, &specs, &flags);
+  tree decl = cp_parser_init_declarator (parser, &specs, NULL, false, false,
+					 flags, NULL, NULL, &decl_init);
+  /* A constant initializer is sometimes not saved in DECL_INITIAL, so
+     DECL_INIT holds the value returned by the parser.  If DECL_INITIAL is
+     set, use it instead, since it already carries all the necessary type
+     conversions.  */
+  if (decl != error_mark_node && DECL_INITIAL (decl))
+    decl_init = DECL_INITIAL (decl);
+
+  
+  if (processing_template_decl)
+    add_stmt (decl_init);
+  else
+    *init = decl_init;
+  parser->scope = NULL_TREE;
+  parser->qualifying_scope = NULL_TREE;
+  parser->object_scope = NULL_TREE;
+
+  if (decl == error_mark_node || DECL_INITIAL (decl) == error_mark_node
+      || TREE_TYPE (decl) == error_mark_node)
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      gcc_assert (errorcount || sorrycount);
+      return error_mark_node;
+    }
+  return decl;
+}
+
+static void
+cp_parser_cilk_grainsize (cp_parser *parser, cp_token *pragma_tok)
+{
+  if (cp_parser_require (parser, CPP_EQ, RT_EQ))
+    {
+      tree exp = cp_parser_binary_expression (parser, false, false,
+					      PREC_NOT_OPERATOR, NULL);
+      cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+      if (!exp || exp == error_mark_node)
+	{
+	  error_at (pragma_tok->location, "invalid grainsize for _Cilk_for");
+	  return;
+	}
+      cp_token *n_tok = cp_lexer_peek_token (parser->lexer);
+
+      /* Make sure the next token is _Cilk_for; it is invalid otherwise.  */
+      if (n_tok && n_tok->type == CPP_KEYWORD && n_tok->keyword == RID_CILK_FOR)
+	{
+	  cp_lexer_consume_token (parser->lexer);
+	  tree cfor = cp_parser_cilk_for (parser, RID_CILK_FOR, exp);
+	  if (cfor && STATEMENT_CODE_P (TREE_CODE (cfor)))
+	    SET_EXPR_LOCATION (cfor, n_tok->location);
+	}
+      else
+	warning (0, "%<#pragma cilk grainsize%> is not followed by "
+		 "%<_Cilk_for%>");
+      return;
+    }
+  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+}
+	      
+
 /* Top-level function to parse _Cilk_for and the for statement
    following <#pragma simd>.  */
 
 static tree
-cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword, tree clauses)
+cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword,
+		    tree clauses_or_grain)
 {
   bool valid = true;
   tree cond = NULL_TREE;
   tree incr_expr = NULL_TREE;
   tree init = NULL_TREE, pre_body = NULL_TREE, decl;
   location_t loc = cp_lexer_peek_token (parser->lexer)->location;
-  
-  if (!cp_lexer_next_token_is_keyword (parser->lexer, for_keyword))
+  tree clauses = (for_keyword == RID_FOR) ? clauses_or_grain : NULL_TREE;
+  tree grain = (for_keyword == RID_CILK_FOR) ? clauses_or_grain : NULL_TREE;
+  tree statement = NULL_TREE;
+
+  /* If FOR_KEYWORD is RID_CILK_FOR, the caller has already consumed the
+     keyword, so only check for and consume it for RID_FOR.  */
+  if (for_keyword == RID_FOR)
     {
-      if (for_keyword == RID_FOR)
-	cp_parser_error (parser, "for statement expected");
+      if (!cp_lexer_next_token_is_keyword (parser->lexer, for_keyword))
+	{
+	  cp_parser_error (parser, "for statement expected");
+	  return error_mark_node;
+	}
       else
-	cp_parser_error (parser, "_Cilk_for statement expected");
-      return error_mark_node;
+	cp_lexer_consume_token (parser->lexer);
     }
-  cp_lexer_consume_token (parser->lexer);
-
+  if (for_keyword == RID_CILK_FOR)
+    {
+      tree scope = begin_for_scope (&init);
+      statement = begin_cilk_for_stmt (scope, init);
+    }
+      
   if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
     {
       cp_parser_skip_to_end_of_statement (parser);
@@ -31639,7 +31788,9 @@ cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword, tree clauses)
   /* Parse initialization.  */
   if (for_keyword == RID_FOR)
     decl = cp_parser_simd_for_init_statement (parser, &init, &pre_body);
-
+  else
+    decl = cp_parser_cilk_for_init_statement (parser, &init);
+    
   if (decl == error_mark_node)
     valid = false;
   else if (!decl || (TREE_CODE (decl) != VAR_DECL
@@ -31663,6 +31814,12 @@ cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword, tree clauses)
       /* Skip to the semicolon ending the init.  */
       cp_parser_skip_to_end_of_statement (parser);
     }
+  else if (for_keyword == RID_CILK_FOR)
+    {
+      CILK_FOR_INIT (statement) = init;
+      CILK_FOR_VAR (statement) = decl;
+      finish_cilk_for_init_stmt (statement);
+    }
 
   /* Parse condition.  */
   if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON))
@@ -31673,9 +31830,11 @@ cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword, tree clauses)
       cond = error_mark_node;
     }
   else
-    {
+    { 
       cond = cp_parser_condition (parser);
       cond = finish_cilk_for_cond (cond);
+      if (for_keyword == RID_CILK_FOR)
+	CILK_FOR_COND (statement) = cond;
     }
 
   if (cond == error_mark_node)
@@ -31690,13 +31849,11 @@ cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword, tree clauses)
     }
   else
     incr_expr = cp_parser_expression (parser, false, NULL);
-  
-  if (incr_expr == error_mark_node)
+  if (TREE_CODE (incr_expr) == ERROR_MARK)
     {
       cp_parser_skip_to_closing_parenthesis (parser, true, false, false);
       valid = false;
     }
-
   if (!cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
     {
       cp_parser_skip_to_end_of_statement (parser);
@@ -31708,7 +31865,7 @@ cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword, tree clauses)
       gcc_assert (sorrycount || errorcount);
       return error_mark_node;
     }
-
+  
   if (for_keyword == RID_FOR)
     {
       parser->in_statement = IN_CILK_P_SIMD_FOR;
@@ -31728,9 +31885,20 @@ cp_parser_cilk_for (cp_parser *parser, enum rid for_keyword, tree clauses)
     }
   else
     {
-      /* Handle _Cilk_for here when implemented.  */
-      gcc_unreachable ();
-      return NULL_TREE;
+      finish_for_expr (incr_expr, statement);
+      CILK_FOR_EXPR (statement) = incr_expr;
+      int saved_in_statement = parser->in_statement;
+      parser->in_statement = IN_CILK_FOR_STMT;
+      cp_parser_already_scoped_statement (parser);
+      parser->in_statement = saved_in_statement;
+      /* Check if the body satisfies all the requirements of _Cilk_for.
+	 If invalid, then just return error_mark_node.  */
+      CILK_FOR_GRAIN (statement) = grain;
+      statement = finish_cilk_for_stmt (statement);
+      if (statement == error_mark_node
+	  || !cpp_validate_cilk_plus_loop (CILK_FOR_BODY (statement)))
+	return error_mark_node;
+      return statement;
     }
 }
 
diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
index 093ca41..a0ad750 100644
--- a/gcc/cp/parser.h
+++ b/gcc/cp/parser.h
@@ -301,7 +301,9 @@ typedef struct GTY(()) cp_parser {
 #define IN_OMP_FOR		8
 #define IN_IF_STMT             16
 #define IN_CILK_P_SIMD_FOR     32 
 #define IN_CILK_SPAWN          64 
+#define IN_CILK_FOR_STMT       128
+  
   unsigned char in_statement;
 
   /* TRUE if we are presently parsing the body of a switch statement.
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3357966..e7262ad 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13328,6 +13328,45 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
       finish_for_stmt (stmt);
       break;
 
+    case CILK_FOR_STMT:
+      {
+	stmt = begin_cilk_for_stmt (NULL_TREE, NULL_TREE);
+	CILK_FOR_INIT (stmt) = RECUR (CILK_FOR_INIT (t));
+	finish_cilk_for_init_stmt (stmt);
+	tmp = RECUR (CILK_FOR_VAR (t));
+	CILK_FOR_VAR (stmt) = tmp;
+	CILK_FOR_GRAIN (stmt) = CILK_FOR_GRAIN (t);
+
+	tmp = CILK_FOR_COND (t);
+	if (COMPARISON_CLASS_P (tmp))
+	  {
+	    tree op0 = RECUR (TREE_OPERAND (tmp, 0));
+	    tree op1 = RECUR (TREE_OPERAND (tmp, 1));
+	    tmp = build2 (TREE_CODE (tmp), boolean_type_node, op0, op1);
+	  }
+	CILK_FOR_COND (stmt) = tmp;
+
+	tmp = CILK_FOR_EXPR (t);
+	if (TREE_CODE (tmp) == MODIFY_EXPR)
+	  {
+	    tree lhs = TREE_OPERAND (tmp, 0);
+	    tree rhs = TREE_OPERAND (tmp, 1);
+	    lhs = RECUR (lhs);
+	    rhs = build2 (TREE_CODE (rhs), TREE_TYPE (lhs),
+			  RECUR (TREE_OPERAND (rhs, 0)),
+			  RECUR (TREE_OPERAND (rhs, 1)));
+	    tmp = build2 (MODIFY_EXPR, void_type_node, lhs, rhs);
+	  }
+	else
+	  tmp = build2 (TREE_CODE (tmp), void_type_node,
+			RECUR (TREE_OPERAND (tmp, 0)),
+			RECUR (TREE_OPERAND (tmp, 1)));
+	finish_for_expr (tmp, stmt);
+	RECUR (CILK_FOR_BODY (t));
+	stmt = finish_cilk_for_stmt (stmt);
+	CILK_FOR_GRAIN (stmt) = RECUR (CILK_FOR_GRAIN (t));	
+	break;
+      }
     case RANGE_FOR_STMT:
       {
         tree decl, expr;
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index c03cdc5..03e72a1 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -825,7 +825,8 @@ finish_return_stmt (tree expr)
   return r;
 }
 
-/* Begin the scope of a for-statement or a range-for-statement.
+/* Begin the scope of a for-statement, a _Cilk_for statement,
+   or a range-for-statement.
    Both the returned trees are to be used in a call to
    begin_for_stmt or begin_range_for_stmt.  */
 
@@ -898,7 +899,7 @@ finish_for_cond (tree cond, tree for_stmt, bool ivdep)
 }
 
 /* Finish the increment-EXPRESSION in a for-statement, which may be
-   given by FOR_STMT.  */
+   given by FOR_STMT or CILK_FOR_STMT.  */
 
 void
 finish_for_expr (tree expr, tree for_stmt)
@@ -925,7 +926,10 @@ finish_for_expr (tree expr, tree for_stmt)
   expr = maybe_cleanup_point_expr_void (expr);
   if (check_for_bare_parameter_packs (expr))
     expr = error_mark_node;
-  FOR_EXPR (for_stmt) = expr;
+  if (TREE_CODE (for_stmt) == CILK_FOR_STMT)
+    CILK_FOR_EXPR (for_stmt) = expr;
+  else
+    FOR_EXPR (for_stmt) = expr;
 }
 
 /* Finish the body of a for-statement, which may be given by
@@ -6666,7 +6670,10 @@ finish_omp_cancellation_point (tree clauses)
 tree
 finish_cilk_for_cond (tree cond)
 {
-  return cp_truthvalue_conversion (cond);
+  if (!processing_template_decl)
+    return cp_truthvalue_conversion (cond);
+  else
+    return cond;
 }
 \f
 /* Begin a __transaction_atomic or __transaction_relaxed statement.
@@ -10621,4 +10628,51 @@ capture_decltype (tree decl)
   return type;
 }
 
+/* Begin a _Cilk_for statement.  Returns a new CILK_FOR_STMT.
+   SCOPE and INIT should be the return of begin_for_scope,
+   or both NULL_TREE.  */
+
+tree
+begin_cilk_for_stmt (tree scope, tree init)
+{
+  tree cilk_for_stmt = build_stmt (input_location, CILK_FOR_STMT, NULL_TREE,
+				   NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE,
+				   NULL_TREE, NULL_TREE);
+  if (scope == NULL_TREE)
+    {
+      if (!init)
+	scope = begin_for_scope (&init);
+    }
+  CILK_FOR_INIT (cilk_for_stmt) = init;
+  CILK_FOR_SCOPE (cilk_for_stmt) = scope;
+  return cilk_for_stmt;
+}
+
+/* Finish the init-statement of a _Cilk_for statement, which is given
+   by C_FOR_STMT.  */
+
+void
+finish_cilk_for_init_stmt (tree c_for_stmt)
+{
+  if (processing_template_decl)
+    CILK_FOR_INIT (c_for_stmt) = pop_stmt_list (CILK_FOR_INIT (c_for_stmt));
+  CILK_FOR_BODY (c_for_stmt) = do_pushlevel (sk_block);
+}
+
+/* Finish the body of a _Cilk_for statement, which is given by
+   CILK_FOR_STMT.  Returns a CILK_FOR_STMT that is type checked.  */
+
+tree
+finish_cilk_for_stmt (tree cilk_for_stmt)
+{
+  CILK_FOR_BODY (cilk_for_stmt) = do_poplevel (CILK_FOR_BODY (cilk_for_stmt));
+  tree *scope_ptr = &CILK_FOR_SCOPE (cilk_for_stmt);
+  tree scope = *scope_ptr;
+  *scope_ptr = NULL;
+  add_stmt (do_poplevel (scope));
+  cp_finish_cilk_for_loop (&cilk_for_stmt, processing_template_decl);
+  add_stmt (cilk_for_stmt);
+  return cilk_for_stmt;
+}
+
 #include "gt-cp-semantics.h"
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc
new file mode 100644
index 0000000..dec650c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc
@@ -0,0 +1,42 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+
+int j[10];
+
+int main(void)
+{
+  int error = 0;
+  int j_serial[10];
+  for (int ii = 0; ii < 10; ii++)
+    {
+      j[ii] = 10;
+      j_serial[ii] = 10;
+    }
+  _Cilk_for (int ii = 5; ii < 10; ii++)
+    {
+      j[ii]=ii;
+    }
+
+  for (int ii = 5; ii < 10; ii++)
+    {
+      j_serial[ii] = ii;
+    }
+
+  for (int ii = 0; ii < 10; ii++)
+    {
+      if (j[ii] != j_serial[ii]) 
+	error = 1;    
+    }
+
+  if (error)
+    __builtin_abort ();
+  else
+    return 0;
+
+  return j[9];
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
new file mode 100644
index 0000000..8221371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
@@ -0,0 +1,25 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define SIZE 100
+#define CHECK_VALUE 5
+
+template <class T>
+int func (T start, T end)
+{
+  int Array[SIZE];
+  _Cilk_for (T ii = 0; ii < end; ii++)
+    Array[ii] = CHECK_VALUE;
+  
+  for (T ii = 0; ii < end; ii++)
+    if (Array[ii] != CHECK_VALUE)
+      __builtin_abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  return func <int> (0, 100) + func <long> (0, 100);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc
new file mode 100644
index 0000000..30ea29d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc
@@ -0,0 +1,34 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int main(int argc, char **argv)
+{
+  char Array1[26], Array2[26];
+  char Array1_Serial[26], Array2_Serial[26];
+
+  for (int ii = 0; ii < 26; ii++)  
+    { 
+      Array1[ii] = 'A'+ii;
+      Array1_Serial[ii] = 'A'+ii;
+    }
+  for (int ii = 0; ii < 26; ii++)
+    {
+      Array2[ii] = 'a'+ii;
+      Array2_Serial[ii] = 'a'+ii;
+    }
+
+  _Cilk_for (int ii = 0 ; ii < 26; ii++) 
+    Array1[ii] = Array2[ii];
+
+  for (int ii = 0; ii < 26; ii++)
+    Array1_Serial[ii] = Array2_Serial[ii];
+
+  for (int ii = 0; ii < 26; ii++)  {
+    if (Array1_Serial[ii] != Array1[ii])  { 
+	__builtin_abort ();
+    }
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc
new file mode 100644
index 0000000..3759a36
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc
@@ -0,0 +1,22 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int q[10], seq[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  for (int jj = 0; jj < 10; jj++)  
+	    {
+	      if (seq[jj] == 5)
+		continue;
+	      else
+		seq[jj] = 2;
+	    }
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc
new file mode 100644
index 0000000..38c4d51
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc
@@ -0,0 +1,19 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  for (int jj = 0; jj < 10; jj++) 
+	    seq2[jj] = 5;
+	  continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc
new file mode 100644
index 0000000..e68c700
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc
@@ -0,0 +1,18 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii = max - 1; ii >= start; ii--) 
+	{ 
+	  if (q[ii] != 0) 
+	    continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc
new file mode 100644
index 0000000..17fd064
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc
@@ -0,0 +1,23 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  int jj = 0;
+	  while (jj < 10)
+	    {
+	      seq2[jj] = 1;
+	      jj++;
+	    }
+	  continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc
new file mode 100644
index 0000000..f0ad2a3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc
@@ -0,0 +1,42 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <assert.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <vector>
+#include <list>
+#if HAVE_IO 
+#include <stdio.h>
+#endif
+#define NUMBER 500
+#include <stdlib.h>
+typedef std::pair<int, int> my_type_t;
+
+long
+valid_pairs(std::vector< my_type_t > my_list) 
+{
+  _Cilk_for (int ii = 0; ii < my_list.size(); ii++) 
+    {
+#if HAVE_IO
+    fprintf(stderr, "my_list index: %d, size: %zu.\n", ii, my_list.size());
+#endif
+      if (ii < 0 || ii >= my_list.size())
+	__builtin_abort (); 
+    }
+  return 0;
+}
+
+int main(int argc, char **argv) 
+{
+  std::vector<my_type_t> my_list;
+
+  for (int ii = 0; ii < NUMBER; ii++) 
+    my_list.push_back(my_type_t(ii, ii));
+  long res = valid_pairs(my_list);
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc
new file mode 100644
index 0000000..7d54828
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc
@@ -0,0 +1,77 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int x = 5;
+int q = 25;
+int z = 2;
+
+int square (int b)
+{
+  return (b*b);
+}
+
+template<class T>
+int templated_func (T a, T b, T c)
+{
+  T Array[10];
+#pragma cilk grainsize = a
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = a;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != a)
+      __builtin_abort ();
+
+#pragma cilk grainsize = square ((int) (b/c))
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = b;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != b)
+      __builtin_abort ();
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = c;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != c)
+      __builtin_abort ();
+
+  return 0;
+}
+
+ 
+
+int main (void)
+{
+  int Array[10];
+#pragma cilk grainsize = 5
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 5)
+      __builtin_abort ();
+
+
+#pragma cilk grainsize = x
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 10;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 10)
+      __builtin_abort ();
+
+#pragma cilk grainsize = square (z)
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 15;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 15)
+      __builtin_abort ();
+
+  int r = 5, s=10, t =15;
+  return templated_func (r, s, t);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc
new file mode 100644
index 0000000..4c69712
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc
@@ -0,0 +1,52 @@
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+
+int main (void)
+{
+  int a, iii = 0;
+  _Cilk_for (; iii < 10; iii++) /* { dg-error "expected induction variable" } */
+    a = 5;
+
+  _Cilk_for (iii = 0; iii < 10; iii++) /* { dg-error " must declare variable" } */
+    a = 5;
+
+  _Cilk_for (int qq = 0, jj = 0; qq < 10; qq++) /* { dg-error " initializer cannot have multiple variable declarations" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0, int jj = 0; ii < 10; ii++) /* { dg-error " initializer cannot have multiple variable declarations" } */
+    a = 5;
+
+  _Cilk_for (int rr = 0; ; rr++) /* { dg-error "missing condition" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii = 5; ii++) /* { dg-error "invalid controlling predicate" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii == 5; ii++) /* { dg-error "invalid controlling predicate" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10;) /* { dg-error "missing increment" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii ) /* { dg-error "invalid increment expression" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      a = 5;
+      if (ii == 5)
+	break; /* { dg-error "break statement used in _Cilk_for loop body" } */
+    }
+
+#pragma cilk grainsize 5 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+
+#pragma Silk grainsize = 5 /* { dg-warning "ignoring #pragma Silk grainsize" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+#pragma cilk grainsiz = 5 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc
new file mode 100644
index 0000000..3d1914a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc
@@ -0,0 +1,30 @@
+/* { dg-options "-fcilkplus" } */
+
+#include <setjmp.h>
+int main (void)
+{
+  int a, iii = 0;
+
+  _Cilk_for (volatile int ii = 0; ii < 10; ii++) /* { dg-error "induction variable cannot be volatile" } */
+    a = 5;
+
+  _Cilk_for (static int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+  _Cilk_for (register int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+
+  _Cilk_for (extern int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+
+  _Cilk_for (float ii = 0.0; ii < 10.0; ii += 0.5) /* { dg-error "induction variable must be of integral record or pointer type" } */
+    a = 5;
+
+  jmp_buf env;
+  _Cilk_for (int ii = 0; ii < 10; ii++) 
+    {
+      a = 5;
+      setjmp (env); /* { dg-error "calls to setjmp are not allowed within" } */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc
new file mode 100644
index 0000000..89f6403
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc
@@ -0,0 +1,27 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+struct BruceBoxleitner {
+    int m;
+    BruceBoxleitner (int n = 0) : m(n) { }
+    BruceBoxleitner operator--() { --m; return *this; }
+};
+
+int operator- (BruceBoxleitner a, BruceBoxleitner b) { return a.m - b.m; }
+
+struct BruceLee {
+    int m;
+    explicit BruceLee (int n) : m(n) { }
+};
+
+bool operator> (BruceBoxleitner a, BruceLee b) { return a.m > b.m; }
+int operator- (BruceBoxleitner a, BruceLee b) { return a.m - b.m; }
+
+int main () {
+    _Cilk_for (BruceBoxleitner i = 10; i > BruceLee(0); --i)
+      ;
+    return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc
new file mode 100644
index 0000000..495e9b4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc
@@ -0,0 +1,26 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int main(void)
+{
+  int jj = 0;
+  int total = 0;
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if ((ii % 2) == 0)
+	goto hello_label;
+      else
+	goto world_label;
+
+hello_label:
+     total++;
+world_label:
+     total++;
+    }
+  if (total != 15)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc
new file mode 100644
index 0000000..582ef60
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc
@@ -0,0 +1,88 @@
+/* { dg-options "-fcilkplus" } */
+
+
+#define NUMBER_OF_ELEMENTS 10
+
+#include <cstdlib>
+
+class my_class {
+private:
+  int value;
+public:
+
+  my_class ();
+  my_class (const my_class &val);
+  my_class (my_class &val);
+  my_class (int val);
+  ~my_class ();
+  int getValue();
+  my_class &operator= (my_class &new_value);
+  my_class &operator= (int x);
+  my_class &operator+= (int val)
+  {
+    value += val;
+    return *this;
+  }
+  bool operator< (const my_class &val)
+  {
+    return (value < val.value);
+  }
+};
+
+
+my_class::my_class ()
+{
+  value = 0;
+}
+
+my_class::my_class(int val)
+{
+  value = val;
+}
+
+my_class::my_class (my_class &val)
+{
+  value = val.value;
+}
+
+my_class::my_class (const my_class &val)
+{
+  value = val.value;
+}
+
+my_class::~my_class ()
+{
+  value = -1;
+}
+
+int my_class::getValue ()
+{
+  return value;
+}
+
+my_class & my_class::operator= (my_class &new_value)
+{
+  value = new_value.value;
+  return *this;
+}
+
+my_class &my_class::operator= (int x)
+{
+  value = x;
+  return *this;
+}
+
+int main (void)
+{
+  int n, *array_parallel;
+  my_class length (NUMBER_OF_ELEMENTS);
+    n = NUMBER_OF_ELEMENTS;
+  
+  array_parallel = new int[NUMBER_OF_ELEMENTS];
+  _Cilk_for (my_class ii (0); ii < length; ii += 1) { /* { dg-error " No operator-" } */
+      int x = ii.getValue();
+    array_parallel [x] = x * 2;
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc
new file mode 100644
index 0000000..1326308
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc
@@ -0,0 +1,59 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#if HAVE_IO
+#include <cstdio>
+#endif
+#define TEST 1
+
+
+#define ITER 300
+
+int n_errors;
+#if TEST
+void test (int *array, int n, int val) {
+#if HAVE_IO
+    for (int i = 0; i < n; i++)
+      std::printf("array[%3d] = %2d\n", i, array[i]);
+#endif
+    for (int i = 0; i < n; ++i) {
+        if (array[i] != val) {
+           __builtin_abort (); 
+        }
+    }
+}
+#endif
+ 
+
+int main () {
+    int array[ITER];
+  
+    for (int ii = 0; ii < ITER; ii++)
+      array[ii] = 9;
+    _Cilk_for (int *j = (array); j < array + ITER; j += 1)  {
+       *j = 6; 
+    }
+#if TEST
+    test(array, ITER, 6);
+#endif
+
+    _Cilk_for (int *i = array; i < array + ITER; i += 1) {
+        *i = 1;
+    }
+
+#if TEST
+    test(array, ITER, 1);
+#endif
+
+    _Cilk_for (int *k = array+ITER-1; k >= array; k -= 1) {
+        *k = 8;
+    }
+#if TEST
+    test(array, ITER, 8);
+#endif
+  
+    return 0;
+
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc
new file mode 100644
index 0000000..0ca588d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc
@@ -0,0 +1,111 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#define NUMBER_OF_ELEMENTS 10
+
+#include <cstdlib>
+
+#if HAVE_IO
+#include <cstdio>
+#endif
+
+class my_class {
+private:
+  int value;
+public:
+
+  my_class ();
+  my_class (const my_class &val);
+  my_class (my_class &val);
+  my_class (int val);
+  ~my_class ();
+  int getValue();
+  my_class &operator= (my_class &new_value);
+  my_class &operator+= (int val)
+  {
+    value += val;
+    return *this;
+  }
+  bool operator< (const my_class &val)
+  {
+    return (value < val.value);
+  }
+};
+
+
+my_class::my_class ()
+{
+  value = 0;
+}
+
+my_class::my_class(int val)
+{
+  value = val;
+}
+
+my_class::my_class (my_class &val)
+{
+  value = val.value;
+}
+
+my_class::my_class (const my_class &val)
+{
+  value = val.value;
+}
+
+my_class::~my_class ()
+{
+  value = -1;
+}
+
+int my_class::getValue ()
+{
+  return value;
+}
+
+my_class & my_class::operator= (my_class &new_value)
+{
+  value = new_value.value;
+  return *this;
+}
+
+int operator- (my_class x, my_class y)
+{
+  int val_x = x.getValue ();
+  int val_y = y.getValue ();
+  return (val_x - val_y);
+}
+
+
+int main (void)
+{
+  int n, *array_parallel, *array_serial;
+  my_class length (NUMBER_OF_ELEMENTS);
+    n = NUMBER_OF_ELEMENTS;
+  
+  array_parallel = new int[NUMBER_OF_ELEMENTS];
+  array_serial = new int[NUMBER_OF_ELEMENTS];
+
+  _Cilk_for (my_class ii (0); ii < length; ii += 1) {
+#if HAVE_IO
+    std::printf("ii.getValue() = %d\n", ii.getValue ());
+#endif
+    array_parallel [ii.getValue ()] = ii.getValue() * 2;
+  }
+
+  for (my_class ii (0); ii < length; ii += 1)
+    array_serial [ii.getValue ()] = ii.getValue () * 2;
+  
+  for (int ii = 0; ii < NUMBER_OF_ELEMENTS; ii++)
+    if (array_serial[ii] != array_parallel[ii]) {
+#if HAVE_IO
+      std::printf("array_serial[%3d] = %6d\tarray_parallel[%3d] = %6d\n", ii,
+		  array_serial[ii], ii, array_parallel[ii]);
+#endif
+      __builtin_abort ();
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
new file mode 100644
index 0000000..e4f2ee5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
@@ -0,0 +1,58 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+int main(void)
+{
+vector <int> array;
+vector <int> array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back (ii);
+}
+#endif
+#if 1
+_Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end(); 
+	   iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+#endif
+for (vector<int>::iterator iter2 = array_serial.begin(); 
+     iter2 != array_serial.end(); iter2++)
+{
+   if (*iter2  == 6) 
+     *iter2 = 13;
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter3 = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+
+while (iter3 != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter3 != *iter_serial)
+    __builtin_abort ();
+  iter3++;
+  iter_serial++;
+}
+
+return 0;
+}   
+
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
new file mode 100644
index 0000000..3e350a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
@@ -0,0 +1,50 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <iostream>
+#include <cstdio>
+#include <cstdlib>
+#include <vector>
+#include <algorithm>
+#include <list>
+
+using namespace std;
+
+
+int main(int argc, char **argv)
+{
+  vector <int> number_list, number_list_serial;
+  int new_number = 0;
+  int no_elements = 0;
+  
+  if (argc != 2)
+  {
+    no_elements = 10000;
+  }
+
+
+  number_list.clear();
+  number_list_serial.clear();
+  for (int ii = 0; ii < no_elements; ii++)
+  {
+    number_list.push_back(new_number);
+    number_list_serial.push_back(new_number);
+  }
+
+  _Cilk_for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list[jj] = jj + no_elements;
+  }
+  for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list_serial[jj] = jj + no_elements;
+  }
+
+  for (int jj = 0; jj < no_elements; jj++)
+    if (number_list_serial[jj] != number_list[jj])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
index 27412e8..ff5ea33 100644
--- a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
+++ b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
@@ -64,12 +64,10 @@ dg-finish
 
 dg-init
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O0 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O1 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O2 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O3 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -O0 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -O1 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -O2 -ftree-vectorize -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -O3 -fcilkplus" " "
@@ -77,7 +75,6 @@ dg-finish
 
 dg-init
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O0 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -fcilkplus" " "


* Re: [PATCH] _Cilk_for for C and C++
  2013-11-15 21:45 [PATCH] _Cilk_for for C and C++ Iyer, Balaji V
@ 2013-11-16  1:38 ` Aldy Hernandez
  2013-11-19  1:11   ` Iyer, Balaji V
  2013-11-20  8:05 ` Aldy Hernandez
  2013-11-27 18:37 ` Jason Merrill
  2 siblings, 1 reply; 42+ messages in thread
From: Aldy Hernandez @ 2013-11-16  1:38 UTC (permalink / raw)
  To: Iyer, Balaji V
  Cc: gcc-patches, Jeff Law, Jason Merrill (jason@redhat.com), rth

On 11/15/13 12:23, Iyer, Balaji V wrote:

> This patch is dependent on the following patches:
>
> #pragma simd work (they both share the same parser routines)

I have just committed this to trunk, so it shouldn't be a blocker.

Also, in the past 2 days the #pragma simd parsing has been merged with 
the OpenMP parsing routines, so please adjust your patch accordingly.

Thanks.
Aldy


* RE: [PATCH] _Cilk_for for C and C++
  2013-11-16  1:38 ` Aldy Hernandez
@ 2013-11-19  1:11   ` Iyer, Balaji V
  2013-11-22 19:45     ` Jason Merrill
  2013-11-27 23:55     ` Jeff Law
  0 siblings, 2 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2013-11-19  1:11 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: gcc-patches, Jeff Law, Jason Merrill (jason@redhat.com), rth

[-- Attachment #1: Type: text/plain, Size: 9770 bytes --]

Hello Everyone,
     Please see my comment below:

> -----Original Message-----
> From: Aldy Hernandez [mailto:aldyh@redhat.com]
> Sent: Friday, November 15, 2013 4:51 PM
> To: Iyer, Balaji V
> Cc: gcc-patches@gcc.gnu.org; Jeff Law; Jason Merrill (jason@redhat.com);
> rth@redhat.com
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On 11/15/13 12:23, Iyer, Balaji V wrote:
> 
> > This patch is dependent on the following patches:
> >
> > #pragma simd work (they both share the same parser routines)
> 
> I have just committed this to trunk, so it shouldn't be a blocker.
> 
> Also, in the past 2 days the #pragma simd parsing has been merged with the
> OpenMP parsing routines, so please adjust your patch accordingly.

	Attached, please find refreshed patches (one for C and one for C++).  They were diffed against trunk after Aldy's check-in of the #pragma simd work, so this patch now depends only on _Cilk_spawn and _Cilk_sync (mostly for executing the tests). They have been tested on x86_64 and work successfully.

Here are the fixed ChangeLog entries (C-related entries are given first, then C++):

C-Related Changes
=====================================================================================
gcc/ChangeLog.
2013-11-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* cilk-builtins.def: Added 2 builtin functions: __cilkrts_cilk_for_64
	and __cilkrts_cilk_for_32.
	* cilk-common.c (cilk_declare_looper): New function.
	(cilk_init_builtins): Added two calls to cilk_declare_looper.
	* cilk.h (enum cilk_tree_index): Added two enums: CILK_TI_F_LOOP_32
	and CILK_TI_F_LOOP_64.
	(enum add_variable_type): Moved here from c-family/cilk.c
	(enum cilk_block_type): Likewise.
	(struct wrapper_data): Likewise.
	(struct cilk_for_desc): New struct.
	(cilk_for_32_fndecl): New #define.
	(cilk_for_64_fndecl): Likewise.
	* tree.h (CILK_FOR_INIT): Likewise.
	(CILK_FOR_COND): Likewise.
	(CILK_FOR_EXPR): Likewise.
	(CILK_FOR_BODY): Likewise.
	(CILK_FOR_SCOPE): Likewise.
	(CILK_FOR_GRAIN): Likewise.
	(CILK_FOR_VAR): Likewise.
	* gimplify.c (gimplify_expr): Added CILK_FOR_STMT case.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* langhooks-def.h (LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR): New
	#define.
	(LANG_HOOKS_CILKPLUS): Added LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR field.
	* langhooks.h (struct lang_hooks_for_cilkplus): Added a new field
	gimplify_cilk_for.
	* tree.def: Added a new tree CILK_FOR_STMT.

gcc/c-family/ChangeLog.
2013-11-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-cilkplus.c (c_check_cilk_loop_incr): New function.
	(c_validate_cilk_plus_loop): Likewise.
	(c_check_cilk_loop): Likewise.
	(c_finish_cilk_for_loops): Likewise.
	(cp_finish_cilk_for_loops): Likewise.
	* c-common.c (c_common_resword): Added _Cilk_for keyword.
	* c-common.h (enum rid): Added RID_CILK_FOR.
	(cp_finish_cilk_for_loop): New prototype.
	(c_finish_cilk_for_loop): Likewise.
	(c_validate_cilk_loop): Likewise.
	(c_check_cilk_loop): Likewise.
	(cilk_init_fd): Likewise.
	(cilk_extract_free_variables): Likewise.
	(cilk_create_cilk_helper_decl): Likewise.
	(cilk_call_graph_add_fn): Likewise.
	(cilk_outline_body): Likewise.
	(cilk_check_loop_difference_type): Likewise.
	(declare_cilk_for_parms): Likewise.
	(declare_cilk_for_vars): Likewise.
	(cilk_loop_convert): Likewise.
	(cilk_divide_count): Likewise.
	(cilk_calc_forward_div_op): Likewise.
	(cilk_compute_loop_count): Likewise.
	(insert_cilk_for_nested_fn): Likewise.
	(cilk_compute_loop_var): Likewise.
	(cilk_set_inclusive_and_direction): Likewise.
	(cilk_set_iter_difftype): Likewise.
	(cilk_set_incr_info): Likewise.
	(cilk_set_init_info): Likewise.
	(cilk_simplify_tree): Likewise.
	(cilk_find_code_from_call): Likewise.
	(cilk_tree_operand_noconv): Likewise.
	(cilk_resolve_continue_stmts): Likewise.
	* c-pragma.c (init_pragma): Added pragma grainsize.
	* c-pragma.h (enum pragma_kind): Added PRAGMA_CILK_GRAINSIZE.
	* cilk.c (enum add_variable_type): Moved to ../cilk.h.
	(enum cilk_block_type): Likewise.
	(struct wrapper_data): Likewise.
	(cilk_call_graph_add_fn): New function.
	(cilk_create_cilk_helper_decl): Likewise.
	(cilk_outline): Renamed to cilk_outline_body.  Also added a parameter
	to hold throw flag for C++.
	(cilk_create_wrapper_body): Renamed create_cilk_helper_decl,
	call_graph_add_fn and cilk_outline to cilk_create_cilk_helper_decl,
	cilk_call_graph_add_fn, and cilk_outline_body, respectively.
	(create_cilk_wrapper): Renamed extract_free_variables to
	cilk_extract_free_variables.
	(extract_free_variables): Likewise.
	(cilk_init_cfd): New function.
	(find_cilk_for_library_fn): Likewise.
	(cilk_compute_incr_direction): Likewise.
	(cilk_check_loop_difference_type): Likewise.
	(cilk_simplify_tree): Likewise.
	(declare_cilk_for_vars): Likewise.
	(declare_cilk_for_parms): Likewise.
	(cilk_loop_convert): Likewise.
	(cilk_divide_count): Likewise.
	(cilk_calc_forward_div_op): Likewise.
	(cilk_compute_loop_count): Likewise.
	(insert_cilk_for_nested_fn): Likewise.
	(cilk_compute_loop_var): Likewise.
	(cilk_tree_operand_noconv): Likewise.
	(cilk_find_code_from_call): Likewise.
	(cilk_set_init_info): Likewise.
	(cilk_set_inclusive_and_direction): Likewise.
	(cilk_set_iter_difftype): Likewise.
	(cilk_set_incr_info): Likewise.

gcc/c/ChangeLog.
2013-11-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* Make-lang.in (C_AND_OBJC_OBJS): Added c/c-cilk.o.
	* c-cilk.c: New file.
	* c-objc-common.h (LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR): New
	#define.
	* c-parser.c (c_parser_cilk_for_statement): New function prototype.
	(c_parser_cilk_grainsize): New function prototype and function.
	(c_parser_statement_after_labels): Added RID_CILK_FOR case.
	(c_parser_pragma): Added PRAGMA_CILK_GRAINSIZE case.
	(c_parser_cilk_for_statement): Renamed a parameter.  Added code to
	accommodate RID_CILK_FOR tree (i.e. to parse _Cilk_for statements).
	* c-tree.h (c_gimplify_cilk_for): New prototype.

gcc/testsuite/ChangeLog.
2013-11-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc: New test.
	* g++.dg/cilk-plus/CK/cilk-for-tplt.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk-for.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_grainsize.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_p_errors.cc: Likewise.
	* g++.dg/cilk-plus/CK/cilk_for_t_errors.cc: Likewise.
	* g++.dg/cilk-plus/CK/explicit_ctor.cc: Likewise.
	* g++.dg/cilk-plus/CK/label_test.cc: Likewise.
	* g++.dg/cilk-plus/CK/no-opp-overload-error.cc: Likewise.
	* g++.dg/cilk-plus/CK/plus-equal-one.cc: Likewise.
	* g++.dg/cilk-plus/CK/plus-equal-test.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_test.cc: Likewise.
	* g++.dg/cilk-plus/cilk-plus.exp: Added support to call _Cilk_for
	testcodes.


=============================================================================================

Here are the C++ related ChangeLogs:

gcc/cp/ChangeLog.
2013-11-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* cp-cilk.c: Added langhooks.h and tree.h.
	(callable): New function.
	(calc_count_up_count_down): Likewise.
	(compute_loop_var_cp_iter_hdl): Likewise.
	(cp_create_cilk_for_body): Likewise.
	(create_cilk_for_nested_fn): Likewise.
	(gimplify_cilk_for_1): Likewise.
	(cp_extract_cilk_for_fields): Likewise.
	(cp_gimplify_cilk_for): Likewise.
	* cp-gimplify.c (genericize_cilk_for_stmt): Likewise.
	(cp_genericize_r): Added a check for CILK_FOR_STMT.
	* cp-objcp-common.h (LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR): New
	#define.
	* cp-tree.h (begin_cilk_for_stmt): New prototype.
	(finish_cilk_for_stmt): Likewise.
	(finish_cilk_for_init_stmt): Likewise.
	(cp_gimplify_cilk_for): Likewise.
	* name-lookup.c (begin_scope): Added sk_cilk_for case.
	* name-lookup.h (enum scope_kind): Added sk_cilk_for.
	* parser.c (cp_parser_cilk_grainsize): New function and prototype.
	(cp_parser_init_declarator): Added a new parameter to hold the
	initial value.
	(cp_parser_statement): Added RID_CILK_FOR case.
	(cp_parser_iteration_statement): Likewise.
	(cp_parser_jump_statement): Added IN_CILK_FOR_STMT case (twice).
	(cp_parser_pragma): Added PRAGMA_CILK_GRAINSIZE case.
	(cp_parser_cilk_for_init_statement): New function.
	(cp_parser_cilk_for): Renamed a parameter and added support for
	parsing _Cilk_for loops that are part of Cilk keywords.
	* parser.h (IN_CILK_FOR_STMT): New #define.
	* pt.c (tsubst_expr): Added CILK_FOR_STMT case.
	* semantics.c (begin_for_scope): Added "_Cilk_for statement" in the
	header comment.
	(finish_for_expr): Added support for CILK_FOR_STMT to use this
	function.
	(finish_cilk_for_cond): Added support for processing templates.
	(begin_cilk_for_stmt): New function.
	(finish_cilk_for_init_stmt): Likewise.
	(finish_cilk_for_stmt): Likewise.

gcc/testsuite/ChangeLog.
2013-11-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* gcc.dg/cilk-plus/CK/cilk-for.c: New test.
	* gcc.dg/cilk-plus/CK/cilk_for_decr.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_errors.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_grain.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.
	* gcc.dg/cilk-plus/CK/cilk_for_warning.c: Likewise.
	* gcc.dg/cilk-plus/cilk-plus.exp: Added support to call _Cilk_for
	testcodes.
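
As a rough aid for reading the grainsize tests and the __cilkrts_cilk_for_32/64 builtins listed above, here is a minimal serial sketch of how a grain size can bound the number of iterations handed to one call of an outlined loop body. It assumes a recursive halving strategy. The names split_range, loop_body_fn, fill_body and run_demo are hypothetical and are not the libcilkrts interface, and nothing in the sketch runs in parallel.

```c
#include <assert.h>

/* Hypothetical signature for an outlined loop body that handles the
   half-open iteration range [low, high).  Not the libcilkrts ABI.  */
typedef void (*loop_body_fn) (void *data, int low, int high);

/* Number of times the body was invoked, i.e. the number of chunks.  */
static int leaf_calls;

/* Halve [low, high) until a piece holds at most GRAIN iterations, then
   call BODY on that piece.  A parallel runtime could run the two halves
   concurrently; this sketch runs them one after the other.  */
static void
split_range (loop_body_fn body, void *data, int low, int high, int grain)
{
  if (grain < 1)
    grain = 1;			/* Guard against a non-terminating split.  */

  while (high - low > grain)
    {
      int mid = low + (high - low) / 2;
      split_range (body, data, low, mid, grain);
      low = mid;
    }

  if (low < high)
    {
      leaf_calls++;
      body (data, low, high);
    }
}

/* Example body, mirroring "Array[ii] = 5" from the grainsize test.  */
static void
fill_body (void *data, int low, int high)
{
  int *array = (int *) data;
  for (int ii = low; ii < high; ii++)
    array[ii] = 5;
}

/* Clear ARRAY, fill COUNT elements in chunks of at most GRAIN iterations,
   check that every element was written, and return the chunk count.  */
int
run_demo (int *array, int count, int grain)
{
  for (int ii = 0; ii < count; ii++)
    array[ii] = 0;

  leaf_calls = 0;
  split_range (fill_body, array, 0, count, grain);

  for (int ii = 0; ii < count; ii++)
    assert (array[ii] == 5);

  return leaf_calls;
}
```

Calling run_demo on a 10-element array with a grain of 5 invokes the body on two chunks. A smaller grain produces more, smaller chunks, which is the trade-off the pragma is meant to expose.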

Thanks,

Balaji V. Iyer.

[-- Attachment #2: diff_cilk_for_c.txt --]
[-- Type: text/plain, Size: 104201 bytes --]

diff --git a/gcc/c-family/c-cilkplus.c b/gcc/c-family/c-cilkplus.c
index 6fa979d..010d39d 100644
--- a/gcc/c-family/c-cilkplus.c
+++ b/gcc/c-family/c-cilkplus.c
@@ -26,8 +26,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree.h"
 #include "c-common.h"
 
-/* Validate the body of a _Cilk_for construct or a <#pragma simd> for
-   loop.
+/* Validate the body of a <#pragma simd> for loop.
 
    Returns true if there were no errors, false otherwise.  */
 
@@ -91,3 +90,401 @@ c_finish_cilk_clauses (tree clauses)
     }
   return clauses;
 }
+
+/* Helper function for c_check_cilk_for_loop.
+
+   Validate the increment in a _Cilk_for construct or a <#pragma simd>
+   for loop.
+
+   LOC is the location of the `for' keyword.  DECL is the induction
+   variable.  INCR is the original increment expression.
+
+   Returns the canonicalized increment expression for an OMP_FOR_INCR.
+   If there is a validation error, returns error_mark_node.  */
+
+static tree
+c_check_cilk_loop_incr (location_t loc, tree decl, tree incr)
+{
+  tree orig_incr = incr;
+  if (!incr)
+    {
+      error_at (loc, "missing increment");
+      return error_mark_node;
+    }
+
+  if (EXPR_HAS_LOCATION (incr))
+    loc = EXPR_LOCATION (incr);
+
+  /* We hit this if-statement if we have an overloaded operator like
+     this: *my_class::operator+= (&ii, 1).  For example, see the testcase
+     plus-equal-test.cc.  */
+  if (TREE_CODE (incr) == INDIRECT_REF
+      || TREE_CODE (incr) == CLEANUP_POINT_EXPR)
+    incr = TREE_OPERAND (incr, 0);
+
+  if (TREE_CODE (incr) == TARGET_EXPR)
+    incr = TARGET_EXPR_INITIAL (incr);
+
+  switch (TREE_CODE (incr))
+    {
+    case POSTINCREMENT_EXPR:
+    case PREINCREMENT_EXPR:
+    case POSTDECREMENT_EXPR:
+    case PREDECREMENT_EXPR:
+      if (TREE_OPERAND (incr, 0) != decl)
+	break;
+
+      return incr;
+
+    case MODIFY_EXPR:
+      {
+	tree rhs;
+
+	if (TREE_OPERAND (incr, 0) != decl)
+	  break;
+
+	rhs = TREE_OPERAND (incr, 1);
+	if ((TREE_CODE (rhs) == PLUS_EXPR
+	     || TREE_CODE (rhs) == POINTER_PLUS_EXPR)
+	    && (TREE_OPERAND (rhs, 0) == decl
+		|| TREE_OPERAND (rhs, 1) == decl)
+	    && (INTEGRAL_TYPE_P (TREE_TYPE (rhs))
+		|| POINTER_TYPE_P (TREE_TYPE (rhs))))
+	  return incr;
+	else if (TREE_CODE (rhs) == MINUS_EXPR
+		 && TREE_OPERAND (rhs, 0) == decl
+		 && INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	  return incr;
+	// Otherwise fail because only PLUS_EXPR and MINUS_EXPR are
+	// allowed.
+	break;
+      }
+
+      /* We encounter CALL_EXPR in C++ when we have a case like this:
+	 operator+= (&ii, 1);  */
+    case CALL_EXPR:
+      {
+	enum tree_code code = cilk_find_code_from_call (CALL_EXPR_FN (incr));
+	if (code == POSTINCREMENT_EXPR || code == POSTDECREMENT_EXPR
+	    || code == PREINCREMENT_EXPR || code == PREDECREMENT_EXPR)
+	  {
+	    tree val = CALL_EXPR_ARG (incr, 0);
+	    if (TREE_CODE (val) == ADDR_EXPR
+		|| TREE_CODE (val) == INDIRECT_REF)
+	      val = TREE_OPERAND (val, 0);
+	    if (val != decl)
+	      break;
+	    return incr;
+	  }
+	for (int ii = 0; ii < call_expr_nargs (incr); ii++)
+	  {
+	    tree val = CALL_EXPR_ARG (incr, ii);
+	    if (TREE_CODE (val) == ADDR_EXPR)
+	      val = TREE_OPERAND (val, 0);
+	    if (val == decl)
+	      continue;
+	    else
+	      {
+		tree rhs = val;
+		if (TREE_CODE (rhs) == INTEGER_CST)
+		  return orig_incr;
+		if ((TREE_CODE (rhs) == PLUS_EXPR
+		     || TREE_CODE (rhs) == POINTER_PLUS_EXPR)
+		    && (TREE_OPERAND (rhs, 0) == decl
+			|| TREE_OPERAND (rhs, 1) == decl)
+		    && (INTEGRAL_TYPE_P (TREE_TYPE (rhs))
+			|| POINTER_TYPE_P (TREE_TYPE (rhs))))
+		  return orig_incr;
+		else if (TREE_CODE (rhs) == MINUS_EXPR
+			 && TREE_OPERAND (rhs, 0) == decl
+			 && INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+		  return orig_incr;
+	      }
+	  }
+      }
+
+    default:
+      break;
+    }
+
+  error_at (loc, "invalid increment expression");
+  return error_mark_node;
+}
+
+/* Callback for walk_tree to validate the body of a pragma simd loop
+   or _Cilk_for loop.
+
+   This function is passed in as a function pointer to walk_tree.  *TP is
+   the current tree pointer, *WALK_SUBTREES is set to 0 by this function if
+   recursing into TP's subtrees is unnecessary. *DATA is a bool variable that
+   is set to false if an error has occurred.  */
+
+tree
+c_validate_cilk_plus_loop (tree *tp, int *walk_subtrees, void *data)
+{
+  if (!tp || !*tp)
+    return NULL_TREE;
+
+  bool *valid = (bool *) data;
+
+  switch (TREE_CODE (*tp))
+    {
+    case CALL_EXPR:
+      {
+	tree fndecl = CALL_EXPR_FN (*tp);
+
+	if (TREE_CODE (fndecl) == ADDR_EXPR)
+	  fndecl = TREE_OPERAND (fndecl, 0);
+	if (TREE_CODE (fndecl) == FUNCTION_DECL)
+	  {
+	    if (setjmp_call_p (fndecl))
+	      {
+		error_at (EXPR_LOCATION (*tp),
+			  "calls to setjmp are not allowed within loops "
+			  "annotated with #pragma simd or _Cilk_for loops");
+		*valid = false;
+		*walk_subtrees = 0;
+	      }
+	  }
+	break;
+      }
+
+    case OMP_PARALLEL:
+    case OMP_TASK:
+    case OMP_FOR:
+    case OMP_SIMD:
+    case OMP_SECTIONS:
+    case OMP_SINGLE:
+    case OMP_SECTION:
+    case OMP_MASTER:
+    case OMP_ORDERED:
+    case OMP_CRITICAL:
+    case OMP_ATOMIC:
+    case OMP_ATOMIC_READ:
+    case OMP_ATOMIC_CAPTURE_OLD:
+    case OMP_ATOMIC_CAPTURE_NEW:
+      error_at (EXPR_LOCATION (*tp), "OpenMP statements are not allowed "
+		"within loops annotated with #pragma simd");
+      *valid = false;
+      *walk_subtrees = 0;
+      break;
+
+    default:
+      break;
+    }
+  return NULL_TREE;
+}  
+
+/* Validate the body of a _Cilk_for construct or a <#pragma simd> for
+   loop.
+
+   Returns true if there were no errors, false otherwise.  */
+
+static bool
+c_check_cilk_loop_body (tree body)
+{
+  bool valid = true;
+  walk_tree (&body, c_validate_cilk_plus_loop, (void *) &valid, NULL);
+  return valid;
+}
+
+/* Validate a _Cilk_for construct (or a #pragma simd for loop, which
+   has the same syntactic restrictions).
+
+   Returns TRUE if there were no errors, FALSE otherwise.
+
+   LOC is the location of the for.
+   DECL is the controlling variable.
+   COND is the condition.
+
+   INCRP is a pointer to the increment expression (in case the increment
+   needs to be canonicalized).
+
+   BODY is the body of the LOOP.
+   SCAN_BODY is true if the body must be checked.  */
+
+static bool
+c_check_cilk_for_loop (location_t loc, tree decl, tree cond, tree *incrp, 
+		       tree body, bool scan_body, bool is_cpp, 
+		       bool proc_templates_p)
+{
+  tree incr = *incrp;
+
+  if (TREE_THIS_VOLATILE (decl))
+    {
+      error_at (loc, "iteration variable cannot be volatile");
+      return false;
+    }
+  if (TREE_STATIC (decl))
+    {
+      error_at (loc, "induction variable cannot be static");
+      return false;
+    }
+  if (DECL_REGISTER (decl))
+    {
+      error_at (loc, "induction variable cannot be declared register");
+      return false;
+    }
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (decl))
+      && !POINTER_TYPE_P (TREE_TYPE (decl)))
+    {
+      /* In C++ iterators are allowed.  */
+      if (!is_cpp)
+	{
+	  error_at (loc, "induction variable must be of integral "
+		    "or pointer type (have %qT)", TREE_TYPE (decl));
+	  return false;
+	}
+      /* If we are processing templates then these checks are done
+	 in pt.c.  */
+      else if (!proc_templates_p
+	       && TREE_CODE (TREE_TYPE (decl)) != RECORD_TYPE)
+	{
+	  error_at (loc, "induction variable must be of integral "
+		    "record or pointer type (have %qT)", TREE_TYPE (decl)); 
+	  return false;
+	}
+    }
+
+  /* Validate the condition.  */
+  if (!cond)
+    {
+      error_at (loc, "missing condition");
+      return false;
+    }
+  bool cond_ok = false;
+  if (TREE_CODE (cond) == NE_EXPR
+      || TREE_CODE (cond) == CALL_EXPR
+      || TREE_CODE (cond) == LT_EXPR
+      || TREE_CODE (cond) == LE_EXPR
+      || TREE_CODE (cond) == GT_EXPR
+      || TREE_CODE (cond) == GE_EXPR)
+    {
+      /* Comparison must either be:
+	   DECL <comparison_operator> EXPR
+	   EXPR <comparison_operator> DECL
+      */
+      if (decl == cilk_simplify_tree (TREE_OPERAND (cond, 0)))
+	cond_ok = true;
+      else if (decl == cilk_simplify_tree (TREE_OPERAND (cond, 1)))
+	{
+	  /* Canonicalize the comparison so the DECL is on the LHS.  */
+	  TREE_SET_CODE (cond,
+			 swap_tree_comparison (TREE_CODE (cond)));
+	  TREE_OPERAND (cond, 1) = TREE_OPERAND (cond, 0);
+	  TREE_OPERAND (cond, 0) = decl;
+	  cond_ok = true;
+	}
+    }
+
+  /* In C++ you can have cases like this: x < 5
+     where '<' is overloaded and so it is translated like this:
+     operator< (x, 5), and this is acceptable.  */
+  cond = cilk_simplify_tree (cond);
+  if (!cond_ok && is_cpp && TREE_CODE (cond) == CALL_EXPR)
+    {
+      if (call_expr_nargs (cond) < 2)
+	cond_ok = false;
+      for (int ii = 0; ii < call_expr_nargs (cond); ii++)
+	{
+	  tree val = cilk_simplify_tree (CALL_EXPR_ARG (cond, ii));
+	  if (TREE_CODE (val) == ADDR_EXPR)
+	    val = TREE_OPERAND (val, 0);
+	  else if (TREE_CODE (val) == TARGET_EXPR)
+	    val = TARGET_EXPR_INITIAL (val);
+	  if (decl == val)
+	    {
+	      cond_ok = true;
+	      break;
+	    }
+	}
+    }	
+  if (!cond_ok)
+    {
+      error_at (loc, "invalid controlling predicate");
+      return false;
+    }
+
+  /* Validate and canonicalize the increment.  */
+  incr = c_check_cilk_loop_incr (loc, decl, incr);
+  if (incr == error_mark_node)
+    return false;
+  *incrp = incr;
+
+  if (scan_body && !c_check_cilk_loop_body (body))
+    return false;
+
+  return true;
+}
+
+/* Validate and emit code for the _Cilk_for loop.
+
+   LOC is the location of the _Cilk_for.
+   DECL is the iteration variable.
+   INIT is the initialization expression.
+   COND is the controlling predicate.
+   INCR is the increment expression.
+   BODY is the body of the loop.
+   SCAN_BODY is true if the body of the loop must be verified.
+
+   Returns the generated statement.  */
+
+tree
+c_finish_cilk_for_loop (location_t loc, tree decl, tree init, tree cond,
+			tree incr, tree body, tree grain, bool is_cpp,
+			bool proc_templates_p)
+{
+  if (!c_check_cilk_for_loop (loc, decl, cond, &incr, body, true, is_cpp, 
+			      proc_templates_p))
+    return NULL;
+
+  /* In the case for "_Cilk_for (int i = 0...)", init will be a decl.  It
+     should have a DECL_INITIAL that we can turn into an assignment.  */
+  if (init == decl)
+    {
+      location_t rhs_loc = DECL_SOURCE_LOCATION (decl);
+      init = DECL_INITIAL (decl);
+      if (!init)
+	{
+	  error_at (rhs_loc, "%qE is not initialized", decl);
+	  init = integer_zero_node;
+	  return NULL;
+	}
+      init = build2 (INIT_EXPR, TREE_TYPE (decl), decl, init);
+      DECL_INITIAL (decl) = NULL;
+    }
+
+  tree t = make_node (CILK_FOR_STMT);
+  TREE_TYPE (t) = void_type_node;
+
+  init = build2 (INIT_EXPR, TREE_TYPE (decl), decl, init);
+  CILK_FOR_INIT (t) = init;
+  CILK_FOR_COND (t) = cond;
+  CILK_FOR_EXPR (t) = incr;
+  CILK_FOR_BODY (t) = body;
+  CILK_FOR_SCOPE (t) = NULL_TREE;
+  CILK_FOR_VAR (t) = decl;
+  CILK_FOR_GRAIN (t) = grain;
+
+  SET_EXPR_LOCATION (t, loc);
+  return add_stmt (t);
+}  
+
+/* Similar to c_finish_cilk_for_loop, but don't actually create the
+   CILK_FOR_STMT tree and return it.  *CILK_FOR_STMT is the CILK_FOR_STMT
+   tree and PROC_TEMPLATES_P is set if we are processing templates.  */
+
+void
+cp_finish_cilk_for_loop (tree *cilk_for_stmt, bool proc_templates_p)
+{
+  tree cfor = *cilk_for_stmt;
+  tree incr = CILK_FOR_EXPR (cfor);
+  if (!c_check_cilk_for_loop (EXPR_LOCATION (cfor), CILK_FOR_VAR (cfor),
+			      CILK_FOR_COND (cfor), &incr, 
+			      CILK_FOR_BODY (cfor), true, true, 
+			      proc_templates_p))
+    *cilk_for_stmt = error_mark_node;
+  else
+    CILK_FOR_EXPR (*cilk_for_stmt) = incr;
+}
+
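
A rough illustration of the predicate canonicalization performed in c_check_cilk_for_loop above, written as a standalone toy rather than GCC code: when the control variable sits on the right-hand side of the comparison, the operator is mirrored and the operands are exchanged so that the variable ends up on the left. The enum, the struct, and both function names are hypothetical stand-ins for GCC's tree codes and for swap_tree_comparison; operand identity is modeled with strings purely for readability.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's comparison tree codes.  */
enum cmp_code { CMP_LT, CMP_LE, CMP_GT, CMP_GE, CMP_NE };

/* Toy comparison node: an operator and two operand names.  */
struct cmp_expr
{
  enum cmp_code code;
  const char *lhs;
  const char *rhs;
};

/* Mirror a comparison so that "a OP b" can be rewritten as "b OP' a".  */
enum cmp_code
swap_cmp (enum cmp_code code)
{
  switch (code)
    {
    case CMP_LT: return CMP_GT;
    case CMP_LE: return CMP_GE;
    case CMP_GT: return CMP_LT;
    case CMP_GE: return CMP_LE;
    default: return code;	/* != is symmetric.  */
    }
}

/* Put VAR on the left-hand side of COND.  Returns false if VAR is not an
   operand at all; the patch reports that case as an invalid controlling
   predicate.  */
bool
canonicalize_cond (struct cmp_expr *cond, const char *var)
{
  if (strcmp (cond->lhs, var) == 0)
    return true;
  if (strcmp (cond->rhs, var) == 0)
    {
      cond->code = swap_cmp (cond->code);
      cond->rhs = cond->lhs;
      cond->lhs = var;
      return true;
    }
  return false;
}
```

In this toy model, canonicalizing "n > ii" with "ii" as the control variable yields "ii < n", while a predicate that never mentions "ii" is rejected.
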
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index d7d5cb2..e500b20 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -416,6 +416,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
old mode 100644
new mode 100755
index b931fd6..507f62d
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
@@ -524,6 +524,14 @@ struct GTY(()) c_language_function {
 
 #define building_stmt_list_p() (stmt_list_stack && !stmt_list_stack->is_empty())
 
+/* In c-cilkplus.c */
+extern void cp_finish_cilk_for_loop (tree *, bool);
+extern tree c_validate_cilk_plus_loop (tree *, int *, void *);
+extern bool c_check_cilk_loop (location_t, tree);
+extern tree c_finish_cilk_for_loop (location_t, tree, tree, tree, tree, tree,
+				    tree, bool, bool);
+
 /* Language-specific hooks.  */
 
 /* If non-NULL, this function is called after a precompile header file
@@ -1292,7 +1300,6 @@ extern enum stv_conv scalar_to_vector (location_t loc, enum tree_code code,
 /* In c-cilkplus.c  */
 extern tree c_finish_cilk_clauses (tree);
 extern tree c_validate_cilk_plus_loop (tree *, int *, void *);
-extern bool c_check_cilk_loop (location_t, tree);
 
 /* These #defines allow users to access different operands of the
    array notation tree.  */
@@ -1377,5 +1384,31 @@ extern tree build_cilk_spawn (location_t, tree);
 extern tree make_cilk_frame (tree);
 extern tree create_cilk_function_exit (tree, bool, bool);
 extern tree cilk_install_body_pedigree_operations (tree);
-
+extern void cilk_init_cfd (struct cilk_for_desc *);
+extern void cilk_extract_free_variables (tree, struct wrapper_data *, int);
+extern tree cilk_create_cilk_helper_decl (struct wrapper_data *);
+extern void cilk_call_graph_add_fn (tree);
+extern void cilk_outline_body (tree, tree *, struct wrapper_data *, bool *);
+extern tree cilk_check_loop_difference_type (tree);
+extern void declare_cilk_for_parms (struct cilk_for_desc *);
+extern void declare_cilk_for_vars (struct cilk_for_desc *, tree);
+extern tree cilk_loop_convert (tree, tree);
+extern tree cilk_divide_count (tree, enum tree_code, tree, bool, tree);
+extern void cilk_calc_forward_div_op (struct cilk_for_desc *, enum tree_code *,
+				      tree *);
+extern tree cilk_compute_loop_count (struct cilk_for_desc *, enum tree_code,
+				     tree, tree, tree);
+extern gimple_seq insert_cilk_for_nested_fn (struct cilk_for_desc *, tree,
+					     tree);
+extern tree cilk_compute_loop_var (struct cilk_for_desc *, tree, tree,
+				   tree (*)(location_t, enum tree_code, tree,
+					    tree, tree));
+extern void cilk_set_inclusive_and_direction (struct cilk_for_desc *);
+extern void cilk_set_iter_difftype (struct cilk_for_desc *);
+extern void cilk_set_incr_info (struct cilk_for_desc *, bool);
+extern void cilk_set_init_info (struct cilk_for_desc *);
+extern tree cilk_simplify_tree (tree);
+extern enum tree_code cilk_find_code_from_call (tree);
+extern tree cilk_tree_operand_noconv (tree);
+extern tree cilk_resolve_continue_stmts (tree *, int *, void *);
 #endif /* ! GCC_C_COMMON_H */
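
The loop-count helpers declared above (cilk_calc_forward_div_op and cilk_divide_count, defined in the cilk.c part of this patch) choose among NOP_EXPR, EXACT_DIV_EXPR, and CEIL_DIV_EXPR when turning a span and an increment into an iteration count. The sketch below only illustrates the arithmetic those three choices stand for; it is not the patch's implementation. The enum and function names are hypothetical, and it assumes a non-negative span and a positive stride.

```c
/* Hypothetical mirror of the three division choices in the patch.  */
enum div_kind { DIV_NONE, DIV_EXACT, DIV_CEIL };

/* Number of iterations needed to cover SPAN (high - low, assumed to be
   non-negative) in steps of STRIDE (assumed to be positive).  */
unsigned long
trip_count (unsigned long span, unsigned long stride, enum div_kind kind)
{
  switch (kind)
    {
    case DIV_NONE:
      /* Stride is exactly one, so no division is needed.  */
      return span;
    case DIV_EXACT:
      /* Caller guarantees that STRIDE divides SPAN evenly.  */
      return span / stride;
    case DIV_CEIL:
      /* Round up so that a partial final step still gets an iteration.  */
      return span / stride + (span % stride != 0);
    }
  return 0;
}
```

For instance, a span of 10 with a stride of 3 needs ceiling division, because the loop visits offsets 0, 3, 6, and 9.
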
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index 029ab1e..7f5f3df 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1390,6 +1390,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 5379b9e..ca8b190 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+  
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c-family/cilk.c b/gcc/c-family/cilk.c
old mode 100644
new mode 100755
index 165348f..a313816
--- a/gcc/c-family/cilk.c
+++ b/gcc/c-family/cilk.c
@@ -35,47 +35,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic.h"
 #include "cilk.h"
 
-enum add_variable_type {
-    /* Reference to previously-defined variable.  */
-    ADD_READ,
-    /* Definition of a new variable in inner-scope.  */
-    ADD_BIND,
-    /* Write to possibly previously-defined variable.  */
-    ADD_WRITE
-};
-
-enum cilk_block_type {
-    /* Indicates a _Cilk_spawn block.  30 was an arbitary number picked for 
-       ease of debugging.  */
-    CILK_BLOCK_SPAWN = 30,
-    /* Indicates _Cilk_for statement block.  */
-    CILK_BLOCK_FOR
-};
-
-struct wrapper_data
-{
-  /* Kind of function to be created.  */
-  enum cilk_block_type type;
-  /* Signature of helper function.  */
-  tree fntype;
-  /* Containing function.  */
-  tree context;
-  /* Disposition of all variables in the inner statement.  */
-  struct pointer_map_t *decl_map;
-  /* True if this function needs a static chain.  */
-  bool nested;
-  /* Arguments to be passed to wrapper function, currently a list.  */
-  tree arglist;
-  /* Argument types, a list.  */
-  tree argtypes;
-  /* Incoming parameters.  */
-  tree parms;
-  /* Outer BLOCK object.  */
-  tree block;
-};
-
-static void extract_free_variables (tree, struct wrapper_data *,
-				    enum add_variable_type);
 static HOST_WIDE_INT cilk_wrapper_count;
 
 /* Marks the CALL_EXPR or FUNCTION_DECL, FCALL, as a spawned function call
@@ -156,8 +115,8 @@ pop_cfun_to (tree outer)
 /* This function does whatever is necessary to make the compiler emit a newly 
    generated function, FNDECL.  */
 
-static void
-call_graph_add_fn (tree fndecl)
+void
+cilk_call_graph_add_fn (tree fndecl)
 {
   const tree outer = current_function_decl;
   struct function *f = DECL_STRUCT_FUNCTION (fndecl);
@@ -283,8 +242,8 @@ cilk_detect_spawn_and_unwrap (tree *exp0)
 /* This function will build and return a FUNCTION_DECL using information 
    from *WD.  */
 
-static tree
-create_cilk_helper_decl (struct wrapper_data *wd)
+tree
+cilk_create_cilk_helper_decl (struct wrapper_data *wd)
 {
   char name[20];
   if (wd->type == CILK_BLOCK_FOR)
@@ -452,6 +411,8 @@ for_local_cb (const void *k_v, void **vp, void *p)
   tree k = *(tree *) &k_v;
   tree v = (tree) *vp;
 
+  if (k == v)
+    return true;
   if (v == error_mark_node)
     *vp = copy_decl_no_change (k, (copy_body_data *) p);
   return true;
@@ -472,15 +433,18 @@ wrapper_local_cb (const void *k_v, void **vp, void *data)
   return true;
 }
 
-/* Alter a tree STMT from OUTER_FN to form the body of INNER_FN.  */
+/* Alter a tree STMT from OUTER_FN to form the body of INNER_FN.  THR is set
+   to true if the original function has exceptions enabled (only applicable
+   to the nested function of a C++ _Cilk_for).  This value is evaluated and
+   then passed back into cp_function_tree->can_throw.  */
 
-static void
-cilk_outline (tree inner_fn, tree *stmt_p, struct wrapper_data *wd)
+void
+cilk_outline_body (tree inner_fn, tree *stmt_p, struct wrapper_data *wd,
+		   bool *thr)
 {
   const tree outer_fn = wd->context;	      
   const bool nested = (wd->type == CILK_BLOCK_FOR);
   copy_body_data id;
-  bool throws;
 
   DECL_STATIC_CHAIN (outer_fn) = 1;
 
@@ -496,7 +460,7 @@ cilk_outline (tree inner_fn, tree *stmt_p, struct wrapper_data *wd)
   id.retvar = 0; 
   id.decl_map = wd->decl_map;
   id.copy_decl = nested ? copy_decl_no_change : copy_decl_for_cilk;
-  id.block = DECL_INITIAL (inner_fn);
+  id.block = 0; 
   id.transform_lang_insert_block = NULL;
 
   id.transform_new_cfg = true;
@@ -515,8 +479,10 @@ cilk_outline (tree inner_fn, tree *stmt_p, struct wrapper_data *wd)
   /* See if this function can throw or calls something that should
      not be spawned.  The exception part is only necessary if
      flag_exceptions && !flag_non_call_exceptions.  */
-  throws = false ;
+  bool throws = thr ? *thr : false;
   (void) walk_tree_without_duplicates (stmt_p, check_outlined_calls, &throws);
+  if (thr)
+    *thr = throws;
 }
 
 /* Generate the body of a wrapper function that assigns the
@@ -538,7 +504,7 @@ create_cilk_wrapper_body (tree stmt, struct wrapper_data *wd)
   /* Emit a function that takes WRAPPER_PARMS incoming and applies ARGS 
      (modified) to the wrapped function.  Return the wrapper and modified ARGS 
      to the caller to generate a function call.  */
-  fndecl = create_cilk_helper_decl (wd);
+  fndecl = cilk_create_cilk_helper_decl (wd);
   push_struct_function (fndecl);
   if (wd->nested && (wd->type == CILK_BLOCK_FOR))
     {
@@ -551,7 +517,7 @@ create_cilk_wrapper_body (tree stmt, struct wrapper_data *wd)
   for (p = wd->parms; p; p = TREE_CHAIN (p))
     DECL_CONTEXT (p) = fndecl;
 
-  cilk_outline (fndecl, &stmt, wd);
+  cilk_outline_body (fndecl, &stmt, wd, NULL);
   stmt = fold_build_cleanup_point_expr (void_type_node, stmt);
   gcc_assert (!DECL_SAVED_TREE (fndecl));
   lang_hooks.cilkplus.install_body_with_frame_cleanup (fndecl, stmt);
@@ -560,7 +526,7 @@ create_cilk_wrapper_body (tree stmt, struct wrapper_data *wd)
   pop_cfun_to (outer);
 
   /* Recognize the new function.  */
-  call_graph_add_fn (fndecl);
+  cilk_call_graph_add_fn (fndecl);
   return fndecl;
 }
 
@@ -702,14 +668,14 @@ create_cilk_wrapper (tree exp, tree *args_out)
      by spawn and the variable must remain in the outer function.  */
   if (TREE_CODE (exp) == INIT_EXPR)
     {
-      extract_free_variables (TREE_OPERAND (exp, 0), &wd, ADD_WRITE);
-      extract_free_variables (TREE_OPERAND (exp, 1), &wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (exp, 0), &wd, ADD_WRITE);
+      cilk_extract_free_variables (TREE_OPERAND (exp, 1), &wd, ADD_READ);
       /* TREE_TYPE should be void.  Be defensive.  */
       if (TREE_TYPE (exp) != void_type_node)
-	extract_free_variables (TREE_TYPE (exp), &wd, ADD_READ);
+	cilk_extract_free_variables (TREE_TYPE (exp), &wd, ADD_READ);
     }
   else
-    extract_free_variables (exp, &wd, ADD_READ);
+    cilk_extract_free_variables (exp, &wd, ADD_READ);
   pointer_map_traverse (wd.decl_map, declare_one_free_variable, &wd);
   wd.block = TREE_BLOCK (exp);
   if (!wd.block)
@@ -996,14 +962,14 @@ add_variable (struct wrapper_data *wd, tree var, enum add_variable_type how)
 
 /* Find the variables referenced in an expression T.  This does not avoid 
    duplicates because a variable may be read in one context and written in 
-   another.  HOW describes the context in which the reference is seen.  If 
+   another.  HOW_T describes the context in which the reference is seen.  If 
    NESTED is true a nested function is being generated and variables in the 
    original context should not be remapped.  */
 
-static void
-extract_free_variables (tree t, struct wrapper_data *wd,
-			enum add_variable_type how)
+void
+cilk_extract_free_variables (tree t, struct wrapper_data *wd, int how_t)
 {  
+  enum add_variable_type how = (enum add_variable_type) how_t;
   if (t == NULL_TREE)
     return;
 
@@ -1011,7 +977,7 @@ extract_free_variables (tree t, struct wrapper_data *wd,
   bool is_expr = IS_EXPR_CODE_CLASS (TREE_CODE_CLASS (code));
 
   if (is_expr)
-    extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
+    cilk_extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
 
   switch (code)
     {
@@ -1031,7 +997,7 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 
     case SSA_NAME:
       /* Currently we don't see SSA_NAME.  */
-      extract_free_variables (SSA_NAME_VAR (t), wd, how);
+      cilk_extract_free_variables (SSA_NAME_VAR (t), wd, how);
       return;
 
     case LABEL_DECL:
@@ -1053,12 +1019,12 @@ extract_free_variables (tree t, struct wrapper_data *wd,
     case NON_LVALUE_EXPR:
     case CONVERT_EXPR:
     case NOP_EXPR:
-      extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
       return;
 
     case INIT_EXPR:
-      extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_BIND);
-      extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_BIND);
+      cilk_extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
       return;
 
     case MODIFY_EXPR:
@@ -1067,8 +1033,8 @@ extract_free_variables (tree t, struct wrapper_data *wd,
     case POSTDECREMENT_EXPR:
     case POSTINCREMENT_EXPR:
       /* These write their result.  */
-      extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_WRITE);
-      extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_WRITE);
+      cilk_extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
       return;
 
     case ADDR_EXPR:
@@ -1079,9 +1045,9 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	 be addressable, and marking it modified will cause a spurious
 	 warning about writing the control variable.  */
       if (wd->type != CILK_BLOCK_SPAWN)
-	extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
       else 
-	extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_WRITE);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_WRITE);
       return;
 
     case ARRAY_REF:
@@ -1093,17 +1059,17 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	 is being accessed here.  As for ADDR_EXPR, don't do this
 	 in a nested loop, unless the access is to a fixed index.  */
       if (wd->type != CILK_BLOCK_FOR || TREE_CONSTANT (TREE_OPERAND (t, 1)))
-	extract_free_variables (TREE_OPERAND (t, 0), wd, how);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, how);
       else
-	extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
-      extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
-      extract_free_variables (TREE_OPERAND (t, 2), wd, ADD_READ);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_OPERAND (t, 2), wd, ADD_READ);
       return;
 
     case TREE_LIST:
-      extract_free_variables (TREE_PURPOSE (t), wd, ADD_READ);
-      extract_free_variables (TREE_VALUE (t), wd, ADD_READ);
-      extract_free_variables (TREE_CHAIN (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_PURPOSE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_VALUE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_CHAIN (t), wd, ADD_READ);
       return;
 
     case TREE_VEC:
@@ -1111,7 +1077,7 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	int len = TREE_VEC_LENGTH (t);
 	int i;
 	for (i = 0; i < len; i++)
-	  extract_free_variables (TREE_VEC_ELT (t, i), wd, ADD_READ);
+	  cilk_extract_free_variables (TREE_VEC_ELT (t, i), wd, ADD_READ);
 	return;
       }
 
@@ -1119,13 +1085,13 @@ extract_free_variables (tree t, struct wrapper_data *wd,
       {
 	unsigned ii = 0;
 	for (ii = 0; ii < VECTOR_CST_NELTS (t); ii++)
-	  extract_free_variables (VECTOR_CST_ELT (t, ii), wd, ADD_READ); 
+	  cilk_extract_free_variables (VECTOR_CST_ELT (t, ii), wd, ADD_READ); 
 	break;
       }
 
     case COMPLEX_CST:
-      extract_free_variables (TREE_REALPART (t), wd, ADD_READ);
-      extract_free_variables (TREE_IMAGPART (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_REALPART (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_IMAGPART (t), wd, ADD_READ);
       return;
 
     case BIND_EXPR:
@@ -1136,11 +1102,11 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	    add_variable (wd, decl, ADD_BIND);
 	    /* A self-referential initialization is no problem because
 	       we already entered the variable into the map as local.  */
-	    extract_free_variables (DECL_INITIAL (decl), wd, ADD_READ);
-	    extract_free_variables (DECL_SIZE (decl), wd, ADD_READ);
-	    extract_free_variables (DECL_SIZE_UNIT (decl), wd, ADD_READ);
+	    cilk_extract_free_variables (DECL_INITIAL (decl), wd, ADD_READ);
+	    cilk_extract_free_variables (DECL_SIZE (decl), wd, ADD_READ);
+	    cilk_extract_free_variables (DECL_SIZE_UNIT (decl), wd, ADD_READ);
 	  }
-	extract_free_variables (BIND_EXPR_BODY (t), wd, ADD_READ);
+	cilk_extract_free_variables (BIND_EXPR_BODY (t), wd, ADD_READ);
 	return;
       }
 
@@ -1148,17 +1114,17 @@ extract_free_variables (tree t, struct wrapper_data *wd,
       {
 	tree_stmt_iterator i;
 	for (i = tsi_start (t); !tsi_end_p (i); tsi_next (&i))
-	  extract_free_variables (*tsi_stmt_ptr (i), wd, ADD_READ);
+	  cilk_extract_free_variables (*tsi_stmt_ptr (i), wd, ADD_READ);
 	return;
       }
 
     case TARGET_EXPR:
       {
-	extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_BIND);
-	extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
-	extract_free_variables (TREE_OPERAND (t, 2), wd, ADD_READ);
+	cilk_extract_free_variables (TREE_OPERAND (t, 0), wd, ADD_BIND);
+	cilk_extract_free_variables (TREE_OPERAND (t, 1), wd, ADD_READ);
+	cilk_extract_free_variables (TREE_OPERAND (t, 2), wd, ADD_READ);
 	if (TREE_OPERAND (t, 3) != TREE_OPERAND (t, 1))
-	  extract_free_variables (TREE_OPERAND (t, 3), wd, ADD_READ);
+	  cilk_extract_free_variables (TREE_OPERAND (t, 3), wd, ADD_READ);
 	return;
       }
 
@@ -1172,32 +1138,32 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 
     case DECL_EXPR:
       if (TREE_CODE (DECL_EXPR_DECL (t)) != TYPE_DECL)
-	extract_free_variables (DECL_EXPR_DECL (t), wd, ADD_BIND);
+	cilk_extract_free_variables (DECL_EXPR_DECL (t), wd, ADD_BIND);
       return;
 
     case INTEGER_TYPE:
     case ENUMERAL_TYPE:
     case BOOLEAN_TYPE:
-      extract_free_variables (TYPE_MIN_VALUE (t), wd, ADD_READ);
-      extract_free_variables (TYPE_MAX_VALUE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_MIN_VALUE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_MAX_VALUE (t), wd, ADD_READ);
       return;
 
     case POINTER_TYPE:
-      extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
       break;
 
     case ARRAY_TYPE:
-      extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
-      extract_free_variables (TYPE_DOMAIN (t), wd, ADD_READ);
+      cilk_extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_DOMAIN (t), wd, ADD_READ);
       return;
 
     case RECORD_TYPE:
-      extract_free_variables (TYPE_FIELDS (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_FIELDS (t), wd, ADD_READ);
       return;
     
     case METHOD_TYPE:
-      extract_free_variables (TYPE_ARG_TYPES (t), wd, ADD_READ);
-      extract_free_variables (TYPE_METHOD_BASETYPE (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_ARG_TYPES (t), wd, ADD_READ);
+      cilk_extract_free_variables (TYPE_METHOD_BASETYPE (t), wd, ADD_READ);
       return;
 
     case AGGR_INIT_EXPR:
@@ -1210,8 +1176,8 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	    len = TREE_INT_CST_LOW (TREE_OPERAND (t, 0));
 
 	    for (ii = 0; ii < len; ii++)
-	      extract_free_variables (TREE_OPERAND (t, ii), wd, ADD_READ);
-	    extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
+	      cilk_extract_free_variables (TREE_OPERAND (t, ii), wd, ADD_READ);
+	    cilk_extract_free_variables (TREE_TYPE (t), wd, ADD_READ);
 	  }
 	break;
       }
@@ -1227,7 +1193,7 @@ extract_free_variables (tree t, struct wrapper_data *wd,
 	  /* Go through the subtrees.  We need to do this in forward order so
 	     that the scope of a FOR_EXPR is handled properly.  */
 	  for (i = 0; i < len; ++i)
-	    extract_free_variables (TREE_OPERAND (t, i), wd, ADD_READ);
+	    cilk_extract_free_variables (TREE_OPERAND (t, i), wd, ADD_READ);
 	}
     }
 }
@@ -1304,3 +1270,872 @@ build_cilk_sync (void)
   TREE_SIDE_EFFECTS (sync) = 1;
   return sync;
 }
+
+/* Zeros out all the fields in CFD.  */
+
+void
+cilk_init_cfd (struct cilk_for_desc *cfd)
+{
+  memset (cfd, 0, sizeof *cfd);
+  init_wd (&cfd->wd, CILK_BLOCK_FOR);
+}
+
+/* Returns the runtime fndecl chosen by the TYPE_PRECISION of COUNT_TYPE.  */
+
+static tree
+find_cilk_for_library_fn (tree count_type)
+{
+  if (TYPE_PRECISION (count_type) == 32)
+    return cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (count_type) == 64)
+    return cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+}
+
+/* This function finds the direction of INCR, the increment expression of
+   the loop.  Returns 0 if the sign of INCR is unknown,
+   +1 if the value is exactly +1,
+   +2 if the value is known to be positive, and
+   -2 if the value is known to be negative.  */
+
+static int
+cilk_compute_incr_direction (tree incr)
+{
+  if (TREE_CODE (incr) != INTEGER_CST)
+    return tree_expr_nonnegative_p (incr) ? 2 : 0;
+  else if (integer_onep (incr))
+    return 1;
+  else
+    return 2 * tree_int_cst_sgn (incr);
+}
+
+/* Return the count type based on TYPE of a Cilk for loop, or unsigned long if
+   there is no acceptable type.  */
+
+tree
+cilk_check_loop_difference_type (tree type)
+{
+  if ((TYPE_PRECISION (type) > TYPE_PRECISION (long_unsigned_type_node))
+      || (TYPE_MAIN_VARIANT (type) == long_long_integer_type_node)
+      || (TYPE_MAIN_VARIANT (type) == long_long_unsigned_type_node))
+    return long_long_unsigned_type_node;
+
+  return long_unsigned_type_node;
+}
+
+/* Removes unwanted wrappers from a tree, T.  */
+
+tree
+cilk_simplify_tree (tree t)
+{
+  extern tree tree_ssa_strip_useless_type_conversions (tree);
+
+  if (TREE_CODE (t) == CLEANUP_POINT_EXPR)
+    t = TREE_OPERAND (t, 0);
+  if (TREE_CODE (t) == NOP_EXPR)
+    t = TREE_OPERAND (t, 0);
+  if ((TREE_CODE (t) == CONVERT_EXPR) && (VOID_TYPE_P (TREE_TYPE (t)) != 0))
+    t = TREE_OPERAND (t, 0);
+
+  STRIP_USELESS_TYPE_CONVERSION (t);
+
+  return t;
+}
+
+/* Set up the variable mapping for the FNDECL and install parameters after
+   declaring the function and scanning the loop body's variable use.
+   Information about the _Cilk_for statement is stored in *CFD.  */
+
+void
+declare_cilk_for_vars (struct cilk_for_desc *cfd, tree fndecl)
+{
+  tree var2 = build_decl (cfd->loc, VAR_DECL, DECL_NAME (cfd->var),
+                     cfd->var_type);
+  DECL_CONTEXT (var2) = fndecl;
+  cfd->var2 = var2;
+
+  void **mapped = pointer_map_contains (cfd->wd.decl_map, cfd->var);
+  /* The loop control variable must be mapped.  */
+  gcc_assert (mapped);
+  const_tree t = (const_tree) *mapped;
+
+  /* The loop control variable may appear as mapped to itself
+     or mapped to integer_one_node depending on its type and
+     how it was modified.  */
+  if ((TREE_CODE (t) != INTEGER_CST) || (t == integer_one_node))
+    {
+      tree save_function = current_function_decl;
+      current_function_decl = DECL_CONTEXT (cfd->var);
+      warning (0, "loop body modifies control variable %qD", cfd->var);
+      current_function_decl = save_function;
+    }
+  *mapped = (void *) var2;
+
+  tree p = cfd->wd.parms;
+  DECL_ARGUMENTS (fndecl) = p;
+  do
+    {
+      DECL_CONTEXT (p) = fndecl;
+      p = TREE_CHAIN (p);
+    }
+  while (p);
+}
+
+/* Set up the signature and parameters of the _Cilk_for body function
+   before declaring the function using information stored in CFD.  */
+
+void
+declare_cilk_for_parms (struct cilk_for_desc *cfd)
+{
+  tree count_type = cfd->count_type;
+  tree ro_count = build_qualified_type (count_type, TYPE_QUAL_CONST);
+  tree ctx = build_decl (cfd->loc, PARM_DECL, NULL_TREE, ptr_type_node);
+  tree t1 = get_identifier ("__low");
+  tree min_parm = build_decl (cfd->loc, PARM_DECL, t1, ro_count);
+  tree t2 = get_identifier ("__high");
+  tree max_parm = build_decl (cfd->loc, PARM_DECL, t2, ro_count);
+
+  DECL_ARG_TYPE (max_parm) = count_type;
+  DECL_ARTIFICIAL (max_parm) = 1;
+  TREE_READONLY (max_parm) = 1;
+
+  DECL_ARG_TYPE (min_parm) = count_type;
+  DECL_ARTIFICIAL (min_parm) = 1;
+  TREE_READONLY (min_parm) = 1;
+
+  DECL_ARG_TYPE (ctx) = ptr_type_node;
+  DECL_ARTIFICIAL (ctx) = 1;
+  TREE_READONLY (ctx) = 1;
+
+  TREE_CHAIN (min_parm) = max_parm;
+  TREE_CHAIN (ctx) = min_parm;
+
+  tree types = tree_cons (NULL_TREE, TREE_TYPE (max_parm), void_list_node);
+  types = tree_cons (NULL_TREE, TREE_TYPE (min_parm), types);
+  types = tree_cons (NULL_TREE, TREE_TYPE (ctx), types);
+
+  cfd->min_parm = min_parm;
+  cfd->max_parm = max_parm;
+  cfd->wd.argtypes = types;
+  cfd->wd.arglist = NULL_TREE;
+  cfd->wd.parms = ctx;
+}
+
+/* Convert a loop expression, EXP, to the form required by _Cilk_for and set
+   its type as indicated by TYPE.  */
+
+tree
+cilk_loop_convert (tree type, tree exp)
+{
+  enum tree_code code;
+  int inprec, outprec;
+  if (type == TREE_TYPE (exp))
+    return exp;
+  inprec = TYPE_PRECISION (TREE_TYPE (exp));
+  outprec = TYPE_PRECISION (type);
+  if (outprec > inprec && !TYPE_UNSIGNED (TREE_TYPE (exp)))
+    code = CONVERT_EXPR;
+  else
+    code = NOP_EXPR;
+  return fold_build1 (code, type, exp);
+}
+
+/* Returns the number of times a loop is divided.  */
+
+tree
+cilk_divide_count (tree count, enum tree_code op, tree incr, bool negate,
+		   tree type)
+{
+  tree dtype;
+
+  if (!count)
+    return NULL_TREE;
+
+  tree ctype = TREE_TYPE (count);
+  tree itype = TREE_TYPE (incr);
+
+  if (op == NOP_EXPR && !negate)
+    return cilk_loop_convert (type, count);
+  /* Return -(unsigned) count instead of (unsigned)-count in case the negate
+     overflows.  */
+  if (op == NOP_EXPR && negate)
+    return fold_build1 (NEGATE_EXPR, type, cilk_loop_convert (type, count));
+
+  /* We are dividing two positive values or else the user has invoked
+     undefined behavior.  That means we can divide in a common narrow
+     type and widen after.  This does not work if we must negate signed
+     INCR to get a positive value because we could be negating INT_MIN.  */
+
+  if (ctype != itype || (negate && !TYPE_UNSIGNED (itype)))
+    {
+      incr = cilk_loop_convert (type, incr);
+      count = cilk_loop_convert (type, count);
+      dtype = type;
+    }
+  else
+    dtype = ctype;
+
+  if (negate)
+    incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (incr), incr);
+
+  count = fold_build2 (op, dtype, count, incr);
+  if (dtype != type)
+    count = cilk_loop_convert (type, count);
+
+  return count;
+}
+
+/* Sets *DIV_OP to the appropriate operation to divide the loop and
+   *FORWARD to the condition expression, based on the DIRECTION, INCR_SIGN
+   and EXACTLY_ONE fields of *CFD.  */
+
+void
+cilk_calc_forward_div_op (struct cilk_for_desc *cfd, enum tree_code *div_op,
+			  tree *forward)
+{
+  switch (cfd->direction)
+    {
+    case -2:
+      *forward = boolean_false_node;
+      *div_op = CEIL_DIV_EXPR;
+      break;
+    case -1:
+      *forward = boolean_false_node;
+      *div_op = EXACT_DIV_EXPR;
+      break;
+    case 0:
+      *forward = build2 (cfd->incr_sign > 0 ? GE_EXPR : LT_EXPR,
+			 boolean_type_node, cfd->incr, integer_zero_node);
+      /* Loops with indeterminate direction use != and are always exact.  */
+      *div_op = EXACT_DIV_EXPR;
+      break;
+    case 1:
+      *forward = boolean_true_node;
+      *div_op = EXACT_DIV_EXPR;
+      break;
+    case 2:
+      *forward = boolean_true_node;
+      *div_op = CEIL_DIV_EXPR;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  if (cfd->exactly_one)
+    *div_op = NOP_EXPR;
+}
+
+/* Returns the loop-count based on the _Cilk_for loop's characteristics given
+   in *CFD.  DIV_OP indicates whether an exact division or a CEILING
+   operation needs to be performed.  COUNT_UP and COUNT_DOWN are not
+   NULL_TREE if the increment and decrement operation are done using an
+   iterator.  */
+
+tree
+cilk_compute_loop_count (struct cilk_for_desc *cfd, enum tree_code div_op,
+			 tree forward, tree count_up, tree count_down)
+{
+  /* If the initial value is not given, then use the variable since it holds
+     the lower value.  */
+  tree low = cfd->lower_bound ? cfd->lower_bound : cfd->var;
+
+  /* Same logic as low for high variable.  */
+  tree high = cfd->end_var ? cfd->end_var : cfd->end_expr;
+
+  if (low == error_mark_node || high == error_mark_node)
+    {
+      gcc_assert (errorcount || sorrycount);
+      return error_mark_node;
+    }
+
+  /* If either count_up or count_down is not NULL, then it is an indication
+     that we have an iterator for loop computation, so we check if
+     cfd->iterator is set to true.  */
+  if (count_up != NULL_TREE || count_down != NULL_TREE)
+    gcc_assert (cfd->iterator);
+  else
+    {
+      tree low_type = TREE_TYPE (low);
+      tree high_type = TREE_TYPE (high);
+      tree sub_type = NULL_TREE;
+
+      if (TREE_CODE (TREE_TYPE (cfd->var)) == POINTER_TYPE)
+	sub_type = ptrdiff_type_node;
+      else
+	{
+	  /* We need to compute HIGH - LOW or LOW - HIGH without overflow.  */
+	  sub_type = common_type (low_type, high_type);
+
+	  /* If subtracting two signed vars. without widening then convert them
+	     to unsigned.  */
+	  if (!TYPE_UNSIGNED (sub_type)
+	      && (TYPE_PRECISION (sub_type) == TYPE_PRECISION (low_type)
+		  || TYPE_PRECISION (sub_type) == TYPE_PRECISION (high_type)))
+	    sub_type = unsigned_type_for (sub_type);
+	}
+      if (low_type != sub_type)
+	low = convert (sub_type, low);
+      if (high_type != sub_type)
+	high = convert (sub_type, high);
+
+      if (cfd->direction <= 0)
+	count_down = fold_build2 (MINUS_EXPR, sub_type, low, high);
+      if (cfd->direction >= 0)
+	count_up = fold_build2 (MINUS_EXPR, sub_type, high, low);
+    }
+
+  /* If the loop is not exact add one before dividing.  Otherwise add 1 after
+     dividing.  Assumed that it can't overflow (meaning that loop range cannot
+     exceed the range of the loop variable or difference type).  */
+  if (cfd->inclusive && div_op == CEIL_DIV_EXPR)
+    {
+      if (count_up)
+	count_up = fold_build2 (PLUS_EXPR, TREE_TYPE (count_up), count_up,
+				build_one_cst (TREE_TYPE (count_up)));
+      if (count_down)
+	count_down = fold_build2 (PLUS_EXPR, TREE_TYPE (count_down), count_down,
+				  build_one_cst (TREE_TYPE (count_down)));
+    }
+
+  /* Serial semantics: INCR is converted to the common type of VAR and INCR then
+     the result is converted to the type of VAR.  */
+  tree incr = cfd->incr;
+  if (!cfd->iterator && TREE_CODE (TREE_TYPE (cfd->var)) != POINTER_TYPE)
+    incr = cilk_loop_convert (common_type (TREE_TYPE (cfd->var),
+					   TREE_TYPE (incr)), incr);
+
+  /* Now separately divide each count by +/- INCR yielding a value with type
+     TYPE.  */
+  count_up = cilk_divide_count (count_up, div_op, incr, cfd->incr_sign < 0,
+				cfd->count_type);
+  count_down = cilk_divide_count (count_down, div_op, incr, cfd->incr_sign > 0,
+				  cfd->count_type);
+  /* Merge forward and backward counts.  */
+  tree count = NULL_TREE;
+  if (!count_up)
+    count = count_down;
+  else if (!count_down)
+    count = count_up;
+  else
+    count = fold_build3 (COND_EXPR, cfd->count_type, forward, count_up,
+			 count_down);
+
+  /* Add one, maybe.  */
+  if (cfd->inclusive && div_op != CEIL_DIV_EXPR)
+    count = fold_build2 (PLUS_EXPR, cfd->count_type, count,
+			 build_one_cst (cfd->count_type));
+
+  return count;
+}
+
+/* Returns a GIMPLE_SEQ that contains a call to the Cilk library function and
+   the necessary temporary variables.  COUNT and FN are parameters to the
+   library function indicating the loop-count and nested function,
+   respectively.  */
+
+gimple_seq
+insert_cilk_for_nested_fn (struct cilk_for_desc *cfd, tree count, tree fn)
+{
+  /* INNER_SEQ contains evaluation of variables holding loop increment and
+     count.  These are evaluated inside the loop guard.  */
+  gimple_seq inner_seq = 0;
+  if (!TREE_CONSTANT (count))
+    {
+      count = fold_build_cleanup_point_expr (TREE_TYPE (count), count);
+      count = get_formal_tmp_var (count, &inner_seq);
+    }
+
+  if (TREE_SIDE_EFFECTS (cfd->incr))
+    cfd->incr = get_formal_tmp_var (cfd->incr, &inner_seq);
+
+  tree libfun = find_cilk_for_library_fn (cfd->count_type);
+  tree ctx = cfd->ctx_arg;
+  if (ctx)
+    {
+      if (TREE_TYPE (ctx) != ptr_type_node)
+	ctx = fold_build1 (NOP_EXPR, ptr_type_node, ctx);
+      if (!DECL_P (ctx))
+	ctx = get_formal_tmp_var (ctx, &inner_seq);
+    }
+  else
+    {
+      ctx = fold_build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fn)), fn);
+      ctx = get_formal_tmp_var (ctx, &inner_seq);
+    }
+  fn = fold_build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fn)), fn);
+  TREE_CONSTANT (fn) = 1;
+  fn = get_formal_tmp_var (fn, &inner_seq);
+
+  tree grain = cfd->grain;
+  tree tmv_count_type = TYPE_MAIN_VARIANT (cfd->count_type);
+  if (!grain)
+    grain = get_formal_tmp_var (build_zero_cst (cfd->count_type), &inner_seq);
+  else if (TYPE_MAIN_VARIANT (TREE_TYPE (grain)) != tmv_count_type)
+    grain = convert (cfd->count_type, grain);
+
+  tree libfun_call = build_call_expr (libfun, 4, fn, ctx, count, grain);
+  gimplify_and_add (libfun_call, &inner_seq);
+  return inner_seq;
+}
+
+/* The loop function looks like
+   
+   body (void *, unsigned long max, unsigned long min)
+   const T start = [outer_context] var;
+   T var';
+   for (unsigned long i = min; i < max; i++) {
+     var' = start + (T) i * (T) incr;
+     body (var');
+   }
+
+   COMPUTE_LOOP_VAR returns an expression for
+   var' = start + i * incr;
+   or
+   var' = start - i * decr;
+   with suitable type conversions.
+
+   If direction is known we know the sign of INCR (or else it's
+   undefined behavior) and we can work with positive unsigned
+   numbers until the last addition or subtraction.
+
+   If direction is not known then the increment and loop variable
+   are signed but the product of the loop count and increment may
+   not be representable as a signed value.
+
+   We can't do the last addition or subtraction in C without
+   a conditional operation because the conversion of unsigned
+   to signed is undefined for "negative" values of the unsigned
+   number.  For now we just pretend this isn't a problem.  We
+   may fail on targets with signed overflow.
+
+   For iterator loops we require that the difference type have
+   enough range and simply pass the value to operator+ or operator-
+   based on the static direction of the loop.  Iterator loop case is
+   handled by the function passed in a function pointer, *ITER_HANDLER.
+
+   LOOP_VAR has type COUNT_TYPE.  */
+
+tree
+cilk_compute_loop_var (struct cilk_for_desc *cfd, tree loop_var,
+		       tree lower_bound,
+		       tree (*iter_handler) (location_t, enum tree_code,
+					     tree, tree, tree))
+{
+  tree count_type = NULL_TREE;
+  if (INTEGRAL_TYPE_P (TREE_TYPE (loop_var)))
+    count_type = TREE_TYPE (loop_var);
+  else
+    count_type = cfd->count_type;
+  
+  /* Compute an expression to be added or subtracted.
+
+     We want to add or subtract LOOP_VAR * INCR.  INCR may be negative.
+     If the static direction is indeterminate we don't know that at
+     compile time.  The code to convert to unsigned and multiply does
+     the right thing in the end.  For iterator loops we don't need to
+     go to that trouble, but scalar loops can have range that can not
+     be represented in the signed loop variable.  */
+  tree scaled = NULL_TREE, incr = NULL_TREE;
+  if (integer_onep (cfd->incr))
+    scaled = loop_var;
+  else
+    {
+      incr = cilk_loop_convert (count_type, cfd->incr);
+      scaled = fold_build2 (MULT_EXPR, count_type, loop_var, incr);
+    }
+
+  enum tree_code add_op = cfd->incr_sign >= 0 ? PLUS_EXPR : MINUS_EXPR;
+  if (cfd->iterator)
+    {
+      /* Convert LOOP_VAR to T3 (difference_type) so that
+         operator+(T1, T3) is preferred over operator+(T1, count_type).
+         operator+ constructs the object if it returns by value.
+         Use operator- if the user wrote -=.  */
+      if (count_type != cfd->difference_type)
+	loop_var = convert (cfd->difference_type, scaled);
+      tree low = lower_bound ? lower_bound : cfd->var;
+      if (TREE_CODE (low) == TREE_LIST)
+	low = TREE_VALUE (low);
+      tree exp = iter_handler (cfd->loc, add_op, low, loop_var, cfd->var2);
+      return exp;
+    }
+  /* The scaled count may not be representable in the type of the
+     loop variable, e.g. if the loop range is INT_MIN+1 to INT_MAX-1
+     the range does not fit in a signed int.  The sum of the lower
+     bound and the count is representable.  Do the addition or
+     subtraction in the wider type, then narrow.  */
+  tree cvt_val = cilk_loop_convert (count_type, lower_bound);
+  tree adjusted = fold_build2 (add_op, count_type, cvt_val, scaled);
+  tree exp = fold_build2 (MODIFY_EXPR, void_type_node, cfd->var2,
+			  cilk_loop_convert (cfd->var_type, adjusted));
+  return exp;
+}
+
+/* Remove NOP_EXPR, ADDR_EXPR and INDIRECT_REF wrappers from EXP and
+   return.  */
+
+tree
+cilk_tree_operand_noconv (tree exp)
+{
+  tree op = exp;
+  while (TREE_CODE (op) == NOP_EXPR
+	 || TREE_CODE (op) == ADDR_EXPR
+	 || TREE_CODE (op) == INDIRECT_REF)
+    op = TREE_OPERAND (op, 0);
+  return op;
+}
+
+/* Return the TREE_CODE for an overloaded function call, FN_CALL.  */
+
+enum tree_code
+cilk_find_code_from_call (tree fn_call)
+{
+  /* Unwrap the ADDR_EXPR layer.  */
+  tree call = TREE_OPERAND (fn_call, 0);
+  call = DECL_NAME (call);
+  const char *name = IDENTIFIER_POINTER (call);
+  char op_name[2];
+  op_name[1] = name[strlen (name) - 1];
+  if (name [strlen (name) - 2] != 'r')
+    op_name[0] = name [strlen (name) - 2];
+  else
+    op_name[0] = ' ';
+
+  if (op_name[1] == '<')
+    return LT_EXPR;
+  else if (op_name[1] == '>')
+    return GT_EXPR;
+  else if (op_name[1] == '=')
+    {
+      if (op_name[0] == '<')
+	return LE_EXPR;
+      else if (op_name[0] == '>')
+	return GE_EXPR;
+      else if (op_name[0] == '!')
+	return NE_EXPR;
+      else if (op_name[0] == '=')
+	return EQ_EXPR;
+      else if (op_name[0] == '+')
+	return PLUS_EXPR;
+      else if (op_name[0] == '-')
+	return MINUS_EXPR;
+      else
+	gcc_unreachable ();
+    }
+  else if (op_name[1] == '+' && op_name[0] == '+')
+    /* This could be post or pre increment expression, but for our case
+       it really does not matter.  */
+    return POSTINCREMENT_EXPR;
+  else if (op_name[1] == '-' && op_name[0] == '-')
+    /* Same reasoning as above for decrement expression.  */
+    return POSTDECREMENT_EXPR;
+  else
+    gcc_unreachable ();
+  return NOP_EXPR;
+}
+
+/* Extracts the initial value of the initializer for a CILK_FOR_STMT.  This
+   information is stored in CFD->LOWER_BOUND.  */
+
+void
+cilk_set_init_info (struct cilk_for_desc *cfd)
+{
+  if (!cfd->lower_bound)
+    return;
+  else if (TREE_CODE (cfd->lower_bound) == MODIFY_EXPR
+	   || TREE_CODE (cfd->lower_bound) == INIT_EXPR)
+    {
+      tree op0 = TREE_OPERAND (cfd->lower_bound, 0);
+      tree op1 = TREE_OPERAND (cfd->lower_bound, 1);
+
+      gcc_assert (op0 == cfd->var);
+      cfd->lower_bound = op1;
+    }
+}
+
+/* Sets CFD->INCLUSIVE, CFD->END_EXPR and CFD->DIRECTION based on the
+   characteristics of the _Cilk_for statement.  */
+
+void
+cilk_set_inclusive_and_direction (struct cilk_for_desc *cfd)
+{
+  tree cond = cfd->cond;
+  enum tree_code cond_code = TREE_CODE (cond);
+  if (cond_code == CALL_EXPR)
+    cond_code = cilk_find_code_from_call (CALL_EXPR_FN (cond));
+  
+  switch (cond_code)
+    {
+    case NE_EXPR:
+      cfd->inclusive = false;
+      cfd->direction = 0;
+      break;
+    case GE_EXPR:
+      cfd->inclusive = true;
+      cfd->direction = -2;
+      break;
+    case GT_EXPR:
+      cfd->inclusive = false;
+      cfd->direction = -2;
+      break;
+    case LE_EXPR:
+      cfd->inclusive = true;
+      cfd->direction = 2;
+      break;
+    case LT_EXPR:
+      cfd->inclusive = false;
+      cfd->direction = 2;
+      break;
+    default:
+      /* == is not allowed.  */
+      gcc_unreachable ();
+    }
+  tree limit = NULL_TREE;
+  tree arg0, arg1;
+  
+  if (TREE_CODE (cond) == CALL_EXPR)
+    {
+      arg0 = CALL_EXPR_ARG (cond, 0);
+      arg1 = CALL_EXPR_ARG (cond, 1);
+    }
+  else
+    {
+      arg0 = TREE_OPERAND (cond, 0);
+      arg1 = TREE_OPERAND (cond, 1);
+    }
+
+  if (cilk_tree_operand_noconv (arg0) == cfd->var)
+    limit = arg1;
+  else
+    {
+      /* If we are here, then we have a case like this: 10 > ii;  */
+      limit = arg0;
+      cfd->direction = -cfd->direction;
+    }
+
+  cfd->end_expr = limit;
+}
+
+/* Sets CFD->ITERATOR and CFD->DIFFERENCE_TYPE based on the characteristics of
+   the _Cilk_for statement.  */
+
+void
+cilk_set_iter_difftype (struct cilk_for_desc *cfd)
+{
+  tree var_type = TREE_TYPE (cfd->var);
+  gcc_assert (var_type);
+  
+  switch (TREE_CODE (var_type))
+    {
+    case POINTER_TYPE:
+      cfd->iterator = false;
+      cfd->difference_type = ptrdiff_type_node;
+      break;
+    case ENUMERAL_TYPE:
+    case BOOLEAN_TYPE:
+    case INTEGER_TYPE:
+      cfd->iterator = false;
+      cfd->difference_type = lang_hooks.types.type_promotes_to (var_type);
+      break;
+    case RECORD_TYPE:
+    case UNION_TYPE:
+      cfd->iterator = true;
+      cfd->difference_type = NULL; /* This will be set later for C++.  */
+      break;
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Populate CFD with characteristics of the increment expression.  If
+   HANDLE_PTR_MULT is set, then the increment is multiplied by the size of
+   the pointed-to type.  This is necessary for C++ but not for C.  */
+
+void
+cilk_set_incr_info (struct cilk_for_desc *cfd, bool handle_ptr_mult)
+{
+  int negate_incr = 0, incr_direction = 0;
+
+  cfd->incr = cilk_simplify_tree (cfd->incr);
+  enum tree_code inc_op = TREE_CODE (cfd->incr);
+  bool is_incr = false;
+  tree op0, op1;
+  tree incr;
+  if (inc_op == ADDR_EXPR || inc_op == CALL_EXPR || inc_op == INDIRECT_REF)
+      /* This indicates that the increment operation is overloaded.  */
+    incr = cilk_tree_operand_noconv (cfd->incr);
+  else if (inc_op == TARGET_EXPR)
+    incr = TARGET_EXPR_INITIAL (cfd->incr);
+  else
+    incr = cfd->incr;
+      
+  if (TREE_CODE (incr) == CALL_EXPR)
+    {
+      inc_op = cilk_find_code_from_call (CALL_EXPR_FN (incr));
+      if (inc_op == PLUS_EXPR || inc_op == MINUS_EXPR)
+	{
+	  op1 = CALL_EXPR_ARG (incr, 1);
+	  op0 = cilk_tree_operand_noconv (CALL_EXPR_ARG (incr, 0));
+	  inc_op = (inc_op == PLUS_EXPR ? PREINCREMENT_EXPR
+		    : PREDECREMENT_EXPR);
+	}
+      else if (inc_op == POSTINCREMENT_EXPR || inc_op == POSTDECREMENT_EXPR
+	       || inc_op == PREDECREMENT_EXPR || inc_op == PREINCREMENT_EXPR)
+	op1 = integer_one_node;
+      else
+	op1 = CALL_EXPR_ARG (incr, 0);
+    }
+  else
+    op1 = TREE_OPERAND (cfd->incr, 1);
+  
+  is_incr = (inc_op == PREINCREMENT_EXPR || inc_op == POSTINCREMENT_EXPR);
+  switch (inc_op)
+    {
+    case POSTDECREMENT_EXPR:
+    case PREDECREMENT_EXPR:
+    case PREINCREMENT_EXPR:
+    case POSTINCREMENT_EXPR:
+      negate_incr = !is_incr;
+      incr_direction = is_incr ? 1 : -1;
+      cfd->incr = op1;
+      if (!cfd->incr)
+	{
+	  tree var_type = TREE_TYPE (cfd->var);
+	  if (TREE_CODE (var_type) == POINTER_TYPE)
+	    cfd->incr = size_in_bytes (TREE_TYPE (var_type));
+	  else
+	    cfd->incr = integer_one_node;
+	}
+      cfd->exactly_one = integer_onep (cfd->incr);
+      break;
+    case MODIFY_EXPR:
+      {
+	/* In here the expressions will have the form var <MODIFY_OP> incr or
+	   op = op <OPERATION> incr.  */
+	cfd->incr = (TREE_CODE (cfd->incr) != MODIFY_EXPR ? op1
+		     : TREE_OPERAND (cfd->incr, 1));
+	enum tree_code increment_code = TREE_CODE (cfd->incr);
+	if (increment_code == PLUS_EXPR || increment_code == POINTER_PLUS_EXPR)
+	  {
+	    op0 = TREE_OPERAND (cfd->incr, 0);
+	    op1 = TREE_OPERAND (cfd->incr, 1);
+
+	    if (op0 == cfd->var || DECL_NAME (op0) == DECL_NAME (cfd->var))
+	      cfd->incr = op1;
+	    else if (op1 == cfd->var || DECL_NAME (op1) == DECL_NAME (cfd->var))
+	      cfd->incr = op0;
+	    else
+	      gcc_unreachable ();
+
+	    negate_incr = false;
+	    incr_direction = cilk_compute_incr_direction (cfd->incr);
+	  
+	    /* Adding a negative number treated as unsigned is like adding a
+	       large positive number.  */
+	    if (TYPE_UNSIGNED (cfd->difference_type) && incr_direction < 0)
+	      incr_direction = 2;
+	    cfd->exactly_one = (incr_direction == 1);
+
+	    /* Don't need to do this in POINTER_PLUS_EXPR since it already
+	       does this for you.  */
+	    if (handle_ptr_mult && increment_code != POINTER_PLUS_EXPR)
+	      {
+		tree var_type = TREE_TYPE (cfd->var);
+		if (TREE_CODE (var_type) == POINTER_TYPE)
+		  {
+		    tree size = size_in_bytes (TREE_TYPE (var_type));
+		    if (!integer_onep (size))
+		      {
+			cfd->exactly_one = 0;
+			/* For example, in the following _Cilk_for statement:
+			   _Cilk_for (int *p = a; p < b; p += (char)c)
+			   We need to do the math in a type wider than c.
+			   "build_binary_op" will do the default conversions
+			   which should be enough if SIZE is size_t.  */
+			cfd->incr = build_binary_op (cfd->loc, MULT_EXPR,
+						     cfd->incr, size, 0);
+		      }
+		  }
+	      }
+	  }
+	else if (TREE_CODE (cfd->incr) == MINUS_EXPR)
+	  {
+	    op1 = TREE_OPERAND (cfd->incr, 1);
+	    op0 = TREE_OPERAND (cfd->incr, 0);
+
+	    gcc_assert (op0 == cfd->var
+			|| DECL_NAME (op0) == DECL_NAME (cfd->var));
+	    cfd->incr = op1;
+	    negate_incr = true;
+	    incr_direction = -cilk_compute_incr_direction (cfd->incr);
+
+	    /* Subtracting a negative number is treated as adding a
+	       positive.  */
+	    if (TYPE_UNSIGNED (cfd->difference_type) && incr_direction > 0)
+	      incr_direction = -2;
+	    cfd->exactly_one = (incr_direction == -1);
+
+	    /* In C++ we need to handle the pointer arithmetic manually, but
+	       in C it seems to be handled automatically.  */
+	    if (handle_ptr_mult)
+	      {
+		tree var_type  = TREE_TYPE (cfd->var);
+		if (TREE_CODE (var_type) == POINTER_TYPE)
+		  {
+		    tree size = size_in_bytes (TREE_TYPE (var_type));
+		    if (!integer_onep (size))
+		      {
+			cfd->exactly_one = 0;
+			/* For example, in the following _Cilk_for statement:
+			   _Cilk_for (int *p = b; p > a; p -= (char)c)
+			   We need to do the math in a type wider than c.
+			   "build_binary_op" will do the default conversions
+			   which should be enough if SIZE is size_t.  */
+			cfd->incr = build_binary_op (cfd->loc, MULT_EXPR,
+						     cfd->incr, size, 0);
+		      }
+		  }
+	      }
+	  }
+	else
+	  {
+	    location_t incr_loc = EXPR_LOCATION (cfd->incr);
+	    error_at (incr_loc, "invalid loop increment operation");
+	    cfd->invalid = true;
+	    return;
+	  }
+      }
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  cfd->var_type = TREE_TYPE (cfd->var);
+  cfd->incr_sign = negate_incr ? -1 : 1;
+}
+
+/* Helper function for walk_tree.  Fixes up all the continues inside a
+   _Cilk_for body.  */
+
+tree
+cilk_resolve_continue_stmts (tree *tp, int *walk_subtrees, void *data)
+{
+  tree goto_label = NULL_TREE, goto_stmt = NULL_TREE;
+  if (!tp || !*tp)
+    return NULL_TREE;
+
+  if (TREE_CODE (*tp) == CONTINUE_STMT)
+    {
+      goto_label = (tree) data;
+      goto_stmt = build1 (GOTO_EXPR, void_type_node, goto_label);
+      *tp = goto_stmt;
+      *walk_subtrees = 0;
+    }
+  else if (TREE_CODE (*tp) == FOR_STMT || TREE_CODE (*tp) == WHILE_STMT
+           || TREE_CODE (*tp) == DO_STMT || TREE_CODE (*tp) == CILK_FOR_STMT)
+      /* Inside these statements, a continue targets a different place, not
+         the end of the _Cilk_for.  Do not walk into these trees because they
+         are resolved later.  */
+    *walk_subtrees = 0;
+
+  return NULL_TREE;
+}
diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in
index d79fc4f..baa8af2 100644
--- a/gcc/c/Make-lang.in
+++ b/gcc/c/Make-lang.in
@@ -51,7 +51,7 @@ CFLAGS-c/gccspec.o += $(DRIVER_DEFINES)
 # Language-specific object files for C and Objective C.
 C_AND_OBJC_OBJS = attribs.o c/c-errors.o c/c-decl.o c/c-typeck.o \
   c/c-convert.o c/c-aux-info.o c/c-objc-common.o c/c-parser.o \
-  c/c-array-notation.o $(C_COMMON_OBJS) $(C_TARGET_OBJS)
+  c/c-array-notation.o c/c-cilk.o $(C_COMMON_OBJS) $(C_TARGET_OBJS)
 
 # Language-specific object files for C.
 C_OBJS = c/c-lang.o c-family/stub-objc.o $(C_AND_OBJC_OBJS)
diff --git a/gcc/c/c-cilk.c b/gcc/c/c-cilk.c
new file mode 100755
index 0000000..982b0de
--- /dev/null
+++ b/gcc/c/c-cilk.c
@@ -0,0 +1,359 @@
+/* This file is part of the Intel (R) Cilk (TM) Plus support
+   This file contains the functions required to handle _Cilk_for
+   for the C language.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Balaji V. Iyer <balaji.v.iyer@intel.com>,
+   Intel Corporation
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "c-tree.h"
+#include "langhooks.h"
+#include "gimple.h"
+#include "tree-iterator.h"
+#include "tree-inline.h"
+#include "c-family/c-common.h"
+#include "toplev.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+#include "cilk.h"
+#include "gimplify.h"
+
+/* Get a block for the CILK_FOR_STMT, CFOR.  */
+
+static tree
+block_cilk_for_loop (tree cfor)
+{
+  tree block = tree_block (cfor);
+  if (block)
+    return block;
+  return DECL_INITIAL (current_function_decl);
+}
+
+/* Create or discover the variable to be used in the loop termination
+   condition.  Return true if the cfd->end_var should be used in the
+   guard test around the runtime call.  Otherwise the guard test uses
+   the complex expression, which in C++ may initialize the variable.
+
+   For example, if END_EXPR is
+
+   (target_expr limit (call constructor ...))
+
+   the variable limit is not initialized until the target_expr is
+   evaluated.  */
+
+static bool
+cilk_for_end (struct cilk_for_desc *cfd, gimple_seq *pre_p)
+{
+  tree end = cfd->end_expr;
+  if (TREE_SIDE_EFFECTS (end))
+    {
+      enum tree_code ecode = TREE_CODE (end);
+      if (ecode == INIT_EXPR || ecode == MODIFY_EXPR || ecode == TARGET_EXPR)
+        {
+          cfd->end_var = TREE_OPERAND (end, 0);
+          return false;
+        }
+      else
+        {
+          /* Copy the result of evaluating the expression into a variable.
+             The compiler will probably crash if there's anything
+             complicated in it -- a complicated value needs to go through
+             the other branch of this IF using an explicit temporary.  */
+          cfd->end_var = get_formal_tmp_var (end, pre_p);
+          return true;
+        }
+    }
+  cfd->end_var = end;
+  return false;
+}
+
+/* Handler for iterator to compute the loop variable.  LOW indicates the
+   starting point and LOOP_VAR is the induction variable, and VAR2 is the
+   original induction variable in the _Cilk_for.  Returns an expression
+   (or a STATEMENT_LIST of expressions).  */
+
+static tree
+compute_loop_var_c_iter_hdl (location_t loc, enum tree_code add_op, tree low,
+			     tree loop_var, tree var2)
+{
+  tree exp = fold_build2 (add_op, TREE_TYPE (loop_var), low, loop_var);
+  gcc_assert (exp != error_mark_node);
+
+  exp = build_modify_expr (loc, var2, TREE_TYPE (var2), INIT_EXPR, loc, exp,
+			   TREE_TYPE (exp));
+  gcc_assert (exp != error_mark_node);
+  return exp;
+}
+
+
+/* Creates a body of the _Cilk_for wrapper function with the information
+   in *CFD.  */
+
+static tree
+create_cilk_for_body (struct cilk_for_desc *cfd)
+{
+  declare_cilk_for_parms (cfd);
+  cfd->wd.fntype = build_function_type (void_type_node, cfd->wd.argtypes);
+  tree fndecl = cilk_create_cilk_helper_decl (&cfd->wd);
+
+  tree outer = current_function_decl;
+  push_struct_function (fndecl);
+  current_function_decl = fndecl;
+
+  declare_cilk_for_vars (cfd, fndecl);
+
+  tree body = push_stmt_list ();
+  tree mod_expr = NULL_TREE, loop_var = NULL_TREE;
+  tree lower_bound = cfd->lower_bound;
+  if (!lower_bound)
+    {
+      lower_bound = cfd->var;
+      tree hack = build_decl (EXPR_LOCATION (cfd->var), VAR_DECL, NULL_TREE,
+			      TREE_TYPE (lower_bound));
+      DECL_CONTEXT (hack) = DECL_CONTEXT (lower_bound);
+      DECL_NAME (hack) = DECL_NAME (lower_bound);
+      *pointer_map_insert (cfd->wd.decl_map, hack) = lower_bound;
+      lower_bound = hack;
+    }
+  if (INTEGRAL_TYPE_P (cfd->var_type))
+    {
+      tree new_min_parm = fold_build1 (CONVERT_EXPR, cfd->var_type,
+				       cfd->min_parm);
+      loop_var = create_tmp_var (cfd->var_type, NULL);
+      location_t loc = EXPR_LOCATION (cfd->var);
+      mod_expr = build_modify_expr (loc, loop_var, cfd->var_type, NOP_EXPR,
+				    loc, new_min_parm, cfd->var_type);
+    }
+  else
+    {
+      loop_var = create_tmp_var (TREE_TYPE (cfd->min_parm), NULL);
+      mod_expr = fold_build2 (INIT_EXPR, void_type_node, loop_var,
+			      cfd->min_parm);
+    }
+  add_stmt (mod_expr);
+  tree loop_body = NULL_TREE;
+  tree new_max_parm = NULL_TREE;
+
+  if (!INTEGRAL_TYPE_P (cfd->var_type))
+    new_max_parm = cfd->max_parm;
+  else
+    new_max_parm = fold_build1 (CONVERT_EXPR, cfd->var_type, cfd->max_parm);
+
+  tree end_comp = cilk_compute_loop_var (cfd, loop_var, lower_bound,
+					 compute_loop_var_c_iter_hdl);
+  append_to_statement_list (end_comp, &loop_body);
+  append_to_statement_list (cfd->body, &loop_body);
+  loop_body = fold_build_cleanup_point_expr (void_type_node, loop_body);
+  DECL_SEEN_IN_BIND_EXPR_P (cfd->var2) = 1;
+
+  cilk_outline_body (fndecl, &loop_body, &cfd->wd, NULL);
+  
+  tree loop = push_stmt_list ();
+  /* Now create a loop with c_finish_loop.  */
+  tree incr = fold_build2 (PLUS_EXPR, TREE_TYPE (loop_var), loop_var,
+			   build_one_cst (TREE_TYPE (loop_var)));
+  incr = fold_build2 (MODIFY_EXPR, void_type_node, loop_var, incr);
+  tree cond = fold_build2 (LT_EXPR, boolean_type_node, loop_var, new_max_parm);
+  c_finish_loop (EXPR_LOCATION (cfd->var), cond, incr, loop_body, NULL_TREE,
+		 NULL_TREE, false);
+  loop = pop_stmt_list (loop);
+  add_stmt (loop);
+  body = pop_stmt_list (body);
+
+  tree block = DECL_INITIAL (fndecl);
+  BLOCK_VARS (block) = loop_var;
+  body = build3 (BIND_EXPR, void_type_node, loop_var, body, block);
+  TREE_CHAIN (loop_var) = cfd->var2;
+  if (cilk_detect_spawn_and_unwrap (&body))
+    lang_hooks.cilkplus.install_body_with_frame_cleanup (fndecl, body);
+  else
+    DECL_SAVED_TREE (fndecl) = body;
+
+  pop_cfun ();
+  current_function_decl = outer;
+  return fndecl;
+}
+
+/* Creates a wrapper function for the body of a _Cilk_for statement with the
+   information stored in *CFD.  */
+
+static tree
+create_cilk_for_wrapper (struct cilk_for_desc *cfd)
+{
+  tree old_cfd = current_function_decl;
+  tree incr = cfd->incr;
+  tree var = cfd->var;
+
+  if (POINTER_TYPE_P (TREE_TYPE (var)))
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_WRITE);
+  else
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_READ);
+
+  cilk_extract_free_variables (incr, &cfd->wd, ADD_READ);
+
+  /* Map the loop variable to integer_minus_one_node if we won't really
+     be passing it to the loop body and integer_zero_node otherwise.
+
+     If the map ends up integer_one_node then somebody wrote to the loop
+     variable and that's a user error.
+     The correct map will be installed in declare_for_loop_variables. */
+  *pointer_map_insert (cfd->wd.decl_map, var) =
+    (void *) (cfd->lower_bound ? integer_minus_one_node : integer_zero_node);
+  
+  /* Note that variables are not extracted from the loop condition
+     and increment.  They are evaluated, to the extent they are
+     evaluated, in the context containing the for loop.  */
+  cilk_extract_free_variables (cfd->body, &cfd->wd, ADD_READ);
+
+  tree fn = create_cilk_for_body (cfd);
+  DECL_UNINLINABLE (fn) = 1;
+  current_function_decl = old_cfd;
+  set_cfun (DECL_STRUCT_FUNCTION (current_function_decl));
+  cfun->is_cilk_function = 1;
+
+  /* Add the new function to the cgraph list.  */
+  cilk_call_graph_add_fn (fn);
+  return fn;
+}
+
+/* Helper function for gimplify_cilk_for.  *CFD contains all the relevant
+   information extracted from a _Cilk_for statement passed into the parent
+   function gimplify_cilk_for.  */
+   
+void
+gimplify_cilk_for_1 (struct cilk_for_desc *cfd, gimple_seq *pre_p)
+{ 
+  /* We don't have to evaluate INCR only once, but we do have
+     to evaluate it no more times than in the serial loop.
+     The naive method evaluates INCR exactly that many times
+     except if the static loop direction is indeterminate.
+
+     Storing the increment in a variable is thus mandatory
+     if cfd->direction == 0.  It is an optimization otherwise
+     and there seems no harm and some benefit in doing it.
+
+     The evaluation is on the inner statement list.  The
+     increment can not be referenced prior to the loop test.  */
+  if (TREE_SIDE_EFFECTS (cfd->incr))
+    sorry ("_Cilk_for increment with side effects");
+
+  tree cond = cfd->cond;
+  tree op0 = TREE_OPERAND (cond, 0);
+  tree op1 = TREE_OPERAND (cond, 1);
+  if (!cilk_for_end (cfd, pre_p) && cfd->end_var != cfd->end_expr)
+    {
+      if (op1 == cfd->end_expr)
+	op1 = cfd->end_var;
+      else
+	op0 = cfd->end_var;
+    }
+  cond = fold_build2 (TREE_CODE (cond), boolean_type_node, op0, op1);
+
+  tree forward = NULL_TREE;
+  
+  /* This is set to NOP_EXPR to have an initial value since we are passing in
+     an address to the function below.  */
+  enum tree_code div_op = NOP_EXPR;
+
+  cilk_calc_forward_div_op (cfd, &div_op, &forward);
+
+  tree count = cilk_compute_loop_count (cfd, div_op, forward, NULL_TREE,
+					NULL_TREE);
+  tree fn = create_cilk_for_wrapper (cfd);
+
+  /* Set condition correctly, so that the function below can use it.  */
+  cfd->cond = cond;
+  gimple_seq inner_seq = insert_cilk_for_nested_fn (cfd, count, fn);
+  gimple_seq_add_seq (pre_p, inner_seq);
+}
+
+/* Extracts all the relevant information from CFOR, a CILK_FOR_STMT tree, and
+   stores it in the CFD structure.  */
+
+static void
+c_extract_cilk_for_fields (struct cilk_for_desc *cfd, tree cfor)
+{
+  cfd->var = CILK_FOR_VAR (cfor);
+  cfd->cond = CILK_FOR_COND (cfor);
+  cfd->lower_bound = CILK_FOR_INIT (cfor);
+  cfd->incr = CILK_FOR_EXPR (cfor);
+  cfd->loc = EXPR_LOCATION (cfor);
+  cfd->body = CILK_FOR_BODY (cfor);
+  cfd->grain = CILK_FOR_GRAIN (cfor);
+  cfd->invalid = false;
+
+  /* This function shouldn't be setting these two variables.  */
+  cfd->ctx_arg = NULL_TREE;
+  cfd->count = NULL_TREE;
+
+  if (TREE_CODE (cfd->lower_bound) == MODIFY_EXPR
+      || TREE_CODE (cfd->lower_bound) == INIT_EXPR)
+    {
+      tree op0 = TREE_OPERAND (cfd->lower_bound, 0); 
+      tree op1 = TREE_OPERAND (cfd->lower_bound, 1);
+
+      gcc_assert (op0 == cfd->var);
+      cfd->lower_bound = op1;
+    }
+
+  cilk_set_inclusive_and_direction (cfd);  
+  cilk_set_iter_difftype (cfd);
+  
+  /* Difference type cannot be NULL_TREE here for C.  */
+  cfd->count_type = cilk_check_loop_difference_type (cfd->difference_type);
+  if (cfd->count_type == NULL_TREE)
+    {
+      cfd->invalid = true;
+      return;
+    }
+
+  cilk_set_incr_info (cfd, false);
+}  
+
+/* Main entry-point function to gimplify a cilk_for statement.  *EXPR_P should
+   be a CILK_FOR_STMT tree.  */
+
+int
+c_gimplify_cilk_for (tree *expr_p, gimple_seq *pre_p,
+		     gimple_seq *post_p ATTRIBUTE_UNUSED)
+{
+  struct cilk_for_desc cfd;
+  tree cfor_expr = *expr_p;
+  
+  cfun->is_cilk_function = 1;
+  cilk_init_cfd (&cfd);
+  cfd.wd.block = block_cilk_for_loop (cfor_expr);
+  
+  c_extract_cilk_for_fields (&cfd, cfor_expr);
+  
+  *expr_p = NULL_TREE;
+  if (cfd.invalid)
+    return GS_ERROR;
+
+  tree var = CILK_FOR_VAR (cfor_expr);
+  tree init = CILK_FOR_INIT (cfor_expr);
+  tree init_expr = fold_build2 (MODIFY_EXPR, void_type_node, var, init);
+  
+  gimplify_and_add (init_expr, pre_p);
+  gimplify_cilk_for_1 (&cfd, pre_p);
+  return GS_ALL_DONE;
+}
diff --git a/gcc/c/c-objc-common.h b/gcc/c/c-objc-common.h
index 6ae7b3e..ca7fa4a 100644
--- a/gcc/c/c-objc-common.h
+++ b/gcc/c/c-objc-common.h
@@ -114,4 +114,7 @@ along with GCC; see the file COPYING3.  If not see
 #undef  LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP
 #define LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP  \
   cilk_detect_spawn_and_unwrap
+
+#undef  LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR
+#define LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR c_gimplify_cilk_for
 #endif /* GCC_C_OBJC_COMMON */
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 6f03402..b7fb11a 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1165,6 +1165,8 @@ static void c_parser_switch_statement (c_parser *);
 static void c_parser_while_statement (c_parser *, bool);
 static void c_parser_do_statement (c_parser *, bool);
 static void c_parser_for_statement (c_parser *, bool);
+static void c_parser_cilk_for_statement (c_parser *, tree);
+static void c_parser_cilk_grainsize (c_parser *parser);
 static tree c_parser_asm_statement (c_parser *);
 static tree c_parser_asm_operands (c_parser *);
 static tree c_parser_asm_goto_operands (c_parser *);
@@ -4771,6 +4773,12 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    error_at (loc, "-fcilkplus must be enabled to use %<_Cilk_for%>");
+	  else
+	    c_parser_cilk_for_statement (parser, NULL_TREE);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9381,6 +9389,24 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       c_parser_cilk_simd (parser);
       return false;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not"
+		   " enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
+      return false;
+
     default:
       if (id < PRAGMA_FIRST_EXTERNAL)
 	{
@@ -13558,6 +13584,174 @@ c_parser_cilk_all_clauses (c_parser *parser)
   return c_finish_cilk_clauses (clauses);
 }
 
+/* Parses a _Cilk_for statement. 
+
+   GRAIN is the grain value from <#pragma cilk grainsize>.  If it is
+   NULL_TREE, the runtime will pick an appropriate grain size
+   (highly recommended!).  */
+  
+static void
+c_parser_cilk_for_statement (c_parser *parser, tree grain)
+{
+  tree init, decl, stmt;
+  tree  body;
+  bool fail = false;
+
+  if (!c_parser_next_token_is_keyword (parser, RID_CILK_FOR))
+    { 
+      c_parser_error (parser, "_Cilk_for statement expected");
+      return;
+    }
+
+  c_parser_consume_token (parser);
+  location_t loc = c_parser_peek_token (parser)->location;
+
+  tree block = c_begin_compound_stmt (true);
+
+  if (!c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
+    {
+      add_stmt (c_end_compound_stmt (loc, block, true));
+      return;
+    }
+
+  /* Parse the initialization declaration.  */
+  if (c_parser_next_tokens_start_declaration (parser))
+    {
+      vec<c_token> none_clauses = vNULL;
+      c_token eof_token;
+      memset (&eof_token, 0, sizeof (eof_token));
+      eof_token.type = CPP_EOF;
+      none_clauses.safe_push (eof_token);
+      none_clauses.safe_push (eof_token);
+      c_parser_declaration_or_fndef (parser, true, false, false,
+				     false, false, NULL, none_clauses);
+      decl = check_for_loop_decls (loc, flag_isoc99);
+      if (decl == NULL)
+	goto error_init;
+      if (DECL_INITIAL (decl) == error_mark_node)
+	decl = error_mark_node;
+      init = decl;
+    }
+  else if (c_parser_next_token_is (parser, CPP_NAME)
+	   && c_parser_peek_2nd_token (parser)->type == CPP_EQ)
+    {
+      struct c_expr decl_exp;
+      struct c_expr init_exp;
+      location_t init_loc;
+
+      decl_exp = c_parser_postfix_expression (parser);
+      decl = decl_exp.value;
+
+      c_parser_require (parser, CPP_EQ, "expected %<=%>");
+
+      init_loc = c_parser_peek_token (parser)->location;
+      init_exp = c_parser_expr_no_commas (parser, NULL);
+      init_exp = default_function_array_read_conversion (init_loc,
+							 init_exp);
+      init = build_modify_expr (init_loc, decl, decl_exp.original_type,
+				NOP_EXPR, init_loc, init_exp.value,
+				init_exp.original_type);
+      init = c_process_expr_stmt (init_loc, init);
+
+      c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
+    }
+  else
+    {
+    error_init:
+      c_parser_error (parser, "expected induction variable initialization");
+      c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
+      return;
+    }
+
+  /* Parse the loop condition.  */
+  tree cond = NULL_TREE;
+  if (c_parser_next_token_is_not (parser, CPP_SEMICOLON))
+    {
+      location_t cond_loc = c_parser_peek_token (parser)->location;
+      struct c_expr cond_expr = c_parser_binary_expression (parser, NULL, NULL);
+
+      cond = cond_expr.value;
+      cond = c_objc_common_truthvalue_conversion (cond_loc, cond);
+      cond = c_fully_fold (cond, false, NULL);
+    }
+  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
+
+  /* Parse the increment expression.  */
+  tree incr = NULL_TREE;
+  if (c_parser_next_token_is_not (parser, CPP_CLOSE_PAREN))
+    {
+      location_t incr_loc = c_parser_peek_token (parser)->location;
+      incr = c_process_expr_stmt (incr_loc,
+				  c_parser_expression (parser).value);
+    }
+  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
+
+  if (decl == NULL || decl == error_mark_node || init == error_mark_node)
+    fail = true;
+
+  tree save_break = c_break_label;
+  /* Magic number to inform c_finish_bc_stmt() that we are within a
+     Cilk for construct.  */
+  c_break_label = build_int_cst (size_type_node, 2);
+
+  tree save_cont = c_cont_label;
+  c_cont_label = NULL_TREE;
+  body = c_parser_c99_block_statement (parser);
+  c_break_label = save_break;
+  c_cont_label = save_cont;
+
+  if (!fail) 
+    c_finish_cilk_for_loop (loc, decl, init, cond, incr, body, grain, false, 
+			    false);
+
+  stmt = c_end_compound_stmt (loc, block, true);
+  add_stmt (stmt);
+  c_break_label = save_break;
+  c_cont_label = save_cont;
+}
+
+/* This function helps parse the grainsize pragma for a _Cilk_for statement. 
+   Here is the correct syntax of this pragma: 
+	    #pragma cilk grainsize = <EXP> 
+ */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+  
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>") != 0)
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR)
+	{
+	  error_at (input_location, "cannot convert grain to long integer");
+	  c_parser_skip_to_pragma_eol (parser);
+	}   
+      else if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token && token->type == CPP_KEYWORD
+	      && token->keyword == RID_CILK_FOR)
+	    {
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain && grain != error_mark_node) 
+		c_parser_cilk_for_statement (parser, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
    loops.  */
 
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index c4dfc3b..55f2a7a 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -684,4 +684,7 @@ extern tree c_check_omp_declare_reduction_r (tree *, int *, void *);
 extern void pedwarn_c90 (location_t, int opt, const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
 extern void pedwarn_c99 (location_t, int opt, const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
 
+/* In c-cilk.c */
+extern int c_gimplify_cilk_for (tree *, gimple_seq *, gimple_seq *);
+
 #endif /* ! GCC_C_TREE_H */
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
index 8634194..a279a93 100644
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
\ No newline at end of file
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
index 8e070a3..c5308f4 100644
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -70,7 +70,6 @@ cilk_arrow (tree frame_ptr, int field_number, bool volatil)
 		   field_number, volatil);
 }
 
-
 /* This function will add FIELD of type TYPE to a defined built-in 
    structure.  *NAME is the name of the field to be added.  */
 
@@ -105,6 +104,27 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns the FUNCTION_DECL of the runtime looper *NAME with count type TYPE.  */
+
+static tree
+cilk_declare_looper (const char *name, tree type, enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -268,6 +288,17 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_32",
+					    unsigned_intSI_type_node,
+				       	    BUILT_IN_CILK_FOR_32);
+
+
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_64",
+					    unsigned_intDI_type_node,
+					    BUILT_IN_CILK_FOR_64);
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
index 99b4d78..acdfb9c 100644
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,8 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -65,6 +67,140 @@ enum cilk_tree_index  {
   CILK_TI_MAX
 };
 
+enum add_variable_type {
+    /* Reference to previously-defined variable.  */
+    ADD_READ,
+    /* Definition of a new variable in inner-scope.  */
+    ADD_BIND,
+    /* Write to possibly previously-defined variable.  */
+    ADD_WRITE
+};
+
+enum cilk_block_type {
+    /* Indicates a _Cilk_spawn block.  30 is an arbitrary number picked for
+       ease of debugging.  */
+    CILK_BLOCK_SPAWN = 30,
+    /* Indicates _Cilk_for statement block.  */
+    CILK_BLOCK_FOR
+};
+
+struct wrapper_data
+{
+  /* Kind of function to be created.  */
+  enum cilk_block_type type;
+  /* Signature of helper function.  */
+  tree fntype;
+  /* Containing function.  */
+  tree context;
+  /* Disposition of all variables in the inner statement.  */
+  struct pointer_map_t *decl_map;
+  /* True if this function needs a static chain.  */
+  bool nested;
+  /* Arguments to be passed to wrapper function, currently a list.  */
+  tree arglist;
+  /* Argument types, a list.  */
+  tree argtypes;
+  /* Incoming parameters.  */
+  tree parms;
+  /* Outer BLOCK object.  */
+  tree block;
+};
+
+/* This structure holds all the important information necessary for decomposing
+   a cilk_for statement.  */
+
+struct cilk_for_desc
+{
+  /* Location of the _Cilk_for statement.  */
+  location_t loc;
+
+  /* Information about the wrapper/nested function for _Cilk_for.  */ 
+  struct wrapper_data wd;
+
+  /* Does the loop body trigger undefined behavior at runtime?  */
+  bool invalid;
+
+  /* Indicates if the parent function is a nested function (C++ only).  */
+  bool nested_ok;
+
+  /* Is the loop control variable a RECORD_TYPE?  */
+  bool iterator;
+
+  /* Does the loop range include its upper bound?  */
+  bool inclusive;
+
+  /* Does the loop control variable, after converting pointer to
+     machine address and taking into account sizeof pointed to
+     type, increment or decrement by (plus or minus) one?  */
+  bool exactly_one;
+
+  /* Is the increment stored in this structure to be added (+1)
+     or subtracted (-1)? */
+  signed char incr_sign;
+
+  /* Direction is +/-1 if the increment is known to be exactly one
+     in the user-visible units, +/-2 if the sign is known but the
+     value is not known to be one, and zero if the sign is not known
+     at compile time.  */
+  signed char direction;
+
+  /* Loop upper bound.  END_EXPR is the tree for the loop bound.
+     END_VAR is either END_EXPR or a VAR_DECL holding the stabilized
+     value, if computation of the value has side effects.  */
+  tree end_expr, end_var;
+
+  /* The originally-declared loop control variable.  */
+  tree var;
+
+  /* Lower bound of the loop if it is constant enough.
+     With a constant lower bound the loop body may not
+     need to use the static chain to compute the iterator
+     value.  */
+  tree lower_bound;
+
+  /* Several types:
+
+     The declared type of the loop control variable,
+     T1 in the cilk_for spec.
+
+     The type of the loop count and argument to loop body, currently
+     always unsigned long.  (If pointers are wider, we will need a
+     pointer-sized type.)
+
+     The static type of end, T2 in the cilk_for spec.
+
+     The difference type T3 of T1-T1, which is the same as T1 for
+     integral types.  The difference type may not be wider than the
+     count type.  For integers subtraction is done in count_type
+     in case difference_type can't hold the range.
+
+     If integral, the type of the increment is known to be no wider
+     than var_type otherwise the truncation in
+     VAR = (shorter)((longer)VAR + INCR)
+     would have been rejected.  */
+  tree var_type, count_type, difference_type;
+  tree incr;
+  tree cond;
+  /* The originally-declared body of the loop.  */
+  tree body;
+
+  /* Grainsize set by the user.  */
+  tree grain;
+
+  /* Context argument to generated function, if not (fdesc fn 1).  */
+  tree ctx_arg;
+
+  /* The number of loop iterations, in case the generated function
+     needs to know.  */
+  tree count;
+
+  /* Variables of the generated function.  */
+  tree ctx_parm, min_parm, max_parm;
+
+  /* Copy of the induction variable, but at different function context.  */
+  tree var2;
+};
+
 extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 
 #define cilk_worker_fndecl            cilk_trees[CILK_TI_F_WORKER]
@@ -77,6 +213,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
@@ -90,6 +228,7 @@ extern tree cilk_dot (tree, int, bool);
 extern void cilk_init_builtins (void);
 extern void gimplify_cilk_sync (tree *, gimple_seq *);
 extern tree cilk_call_setjmp (tree);
+
 /* Returns true if Cilk Plus is enabled and if F->cilk_frame_decl is not
    NULL_TREE.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index bb50e25..f8b2b9b 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7246,6 +7246,11 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	    }
 	  break;
 
+	case CILK_FOR_STMT:
+	  ret = (enum gimplify_status)
+	    lang_hooks.cilkplus.gimplify_cilk_for (expr_p, pre_p, post_p);
+	  break;
+
 	case CILK_SPAWN_STMT:
 	  gcc_assert 
 	    (fn_contains_cilk_spawn_p (cfun) 
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index 411cf74..e9c46b2 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -219,11 +219,13 @@ extern bool lhd_cilk_detect_spawn (tree *);
 #define LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP lhd_cilk_detect_spawn
 #define LANG_HOOKS_CILKPLUS_FRAME_CLEANUP lhd_install_body_with_frame_cleanup
 #define LANG_HOOKS_CILKPLUS_GIMPLIFY_SPAWN lhd_gimplify_expr
+#define LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR lhd_gimplify_expr
 
 #define LANG_HOOKS_CILKPLUS {			\
   LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP,	\
   LANG_HOOKS_CILKPLUS_FRAME_CLEANUP,		\
-  LANG_HOOKS_CILKPLUS_GIMPLIFY_SPAWN            \
+  LANG_HOOKS_CILKPLUS_GIMPLIFY_SPAWN,           \
+  LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR         \
 }
 
 #define LANG_HOOKS_DECLS { \
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 9539e7d..fe8e440 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -154,6 +154,11 @@ struct lang_hooks_for_cilkplus
      status, but as mentioned in a previous comment, we can't see that type 
      here, so just return an int.  */
   int (*gimplify_cilk_spawn) (tree *, gimple_seq *, gimple_seq *);
+
+  /* Function to gimplify a _Cilk_for statement.  Returns enum gimplify
+     status, but as mentioned in a previous comment, we can't see that type 
+     here, so just return an int.  */
+  int (*gimplify_cilk_for) (tree *, gimple_seq *, gimple_seq *);
 };
 
 /* Language hooks related to decls and the symbol table.  */
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk-for.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk-for.c
new file mode 100644
index 0000000..caab055
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk-for.c
@@ -0,0 +1,34 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int main(int argc, char **argv)
+{
+  char Array1[26], Array2[26];
+  char Array1_Serial[26], Array2_Serial[26];
+  
+  int ii = 0, error = 0;
+  for (ii = 0; ii < 26; ii++)  
+    { 
+      Array1[ii] = 'A'+ii;
+      Array1_Serial[ii] = 'A'+ii;
+    }
+  for (ii = 0; ii < 26; ii++)
+    {
+      Array2[ii] = 'a'+ii;
+      Array2_Serial[ii] = 'a'+ii;
+    }
+  ii = 0;
+  _Cilk_for (ii = 0 ; ii < 26; ii++) 
+    Array1[ii] = Array2[ii];
+
+  for (ii = 0; ii < 26; ii++)
+    Array1_Serial[ii] = Array2_Serial[ii];
+
+  for (ii = 0; ii < 26; ii++)  {
+    if (Array1_Serial[ii] != Array1[ii])  { 
+	error = 1; 
+    }
+  }
+  return error;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_decr.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_decr.c
new file mode 100644
index 0000000..e45b557
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_decr.c
@@ -0,0 +1,44 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus -w" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define ARRAY_SIZE 1000
+
+int a[ARRAY_SIZE];
+
+int main(void)
+{
+  int i= 0;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    a[i] = 0;
+  
+  _Cilk_for (i = (ARRAY_SIZE-1); i >= 0; i--) 
+    a[i] = i;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    if (a[i] != i)
+      return 1;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    a[i] = 0;
+  
+  _Cilk_for (i = (ARRAY_SIZE-1); i >= 0; i -= 1) 
+    a[i] = i;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    if (a[i] != i)
+      return 1;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    a[i] = 0;
+  
+  _Cilk_for (i = (ARRAY_SIZE-1); i >= 0; --i) 
+    a[i] = i;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+    if (a[i] != i)
+      return 1;
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..c6d6656
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,61 @@
+struct foo
+{
+  int x,y,z;
+  char q;
+};
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+  volatile int vii = 0;
+  static int sii = 0;
+  register int rii = 0;
+  extern int eii;
+  struct foo something, nothing;
+  float fii = 0;
+  _Cilk_for (ii; ii < 10; ii++) /* { dg-error "expected induction variable initialization" } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected induction variable initialization" } */
+    q = 2;
+
+  _Cilk_for (ii = 0; ; ii++) /* { dg-error "missing condition" } */
+    q = 2;
+
+  _Cilk_for (ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+  _Cilk_for (ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
+    q = 5;
+
+  _Cilk_for (sii = 0; sii < 10; sii++) /* { dg-error "induction variable cannot be static" } */
+    q = 5;
+
+  _Cilk_for (rii = 0; rii < 10; rii++) /* { dg-error "induction variable cannot be declared register" } */
+    q = 5;
+
+  _Cilk_for (something = nothing; ii < 10; ii++) /* { dg-error "induction variable must be of integral or" } */
+    q = 5;
+
+  _Cilk_for (fii = 3.47; fii < 5.23; ii++) /* { dg-error "induction variable must be of integral or pointer type" } */
+    q = 5;
+
+  _Cilk_for (ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (ii; ii < 10; ii++) /* { dg-error "expected induction variable initialization" } */
+    q = 5;
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..4c86bf6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,34 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus -w" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+  int ii = 0; 
+
+  for (ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..36d75df
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -fcilkplus" } */
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+  int ii = 0;
+
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK, it will do a type conversion to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..8eec6be
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,28 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus -w" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int *aa = 0;
+  int ii = 0;
+
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+
+  _Cilk_for(aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      return 1;
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_warning.c b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_warning.c
new file mode 100644
index 0000000..f39eb7b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cilk-plus/CK/cilk_for_warning.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+int main (void)
+{
+  int ii = 0, q = 2;
+  _Cilk_for (ii = 0; ii < 10; ii++) /* { dg-warning "loop body modifies control variable" } */
+    ii += q;
+  
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp b/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
index 7407e8e..edab8eb 100644
--- a/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
+++ b/gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
@@ -60,5 +60,12 @@ if { [check_effective_target_lto] } {
     dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -flto -g -fcilkplus $ALWAYS_CFLAGS" " "
 }
 
-
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -g -fcilkplus $ALWAYS_CFLAGS " " "
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O1 -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O2 -std=c99 -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O2 -ftree-vectorize -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O3 -g -fcilkplus $ALWAYS_CFLAGS" " "
+if { [check_effective_target_lto] } {
+dg-runtest [lsort [glob -nocomplain $srcdir/gcc.dg/cilk-plus/CK/*.c]] " -O3 -flto -g -fcilkplus $ALWAYS_CFLAGS" " "
+}
 dg-finish
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7fe849d..de2a24b 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -2661,6 +2661,29 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
       pp_string (buffer, "_Cilk_sync");
       break;
 
+    case CILK_FOR_STMT:
+      if (CILK_FOR_GRAIN (node))
+	{
+	  pp_string (buffer, "#pragma cilk grainsize = ");
+	  dump_generic_node (buffer, CILK_FOR_GRAIN (node), spc, flags, false); 
+	  newline_and_indent (buffer, spc);
+	}
+      pp_string (buffer, "_Cilk_for (");
+      dump_generic_node (buffer, CILK_FOR_INIT (node), spc, flags, false);
+      pp_string (buffer, "; ");
+      dump_generic_node (buffer, CILK_FOR_COND (node), spc, flags, false);
+      pp_string (buffer, "; ");
+      dump_generic_node (buffer, CILK_FOR_EXPR (node), spc, flags, false);
+      pp_string (buffer, ")");
+      newline_and_indent (buffer, spc + 2);
+      pp_left_brace (buffer);
+      newline_and_indent (buffer, spc + 4);
+      dump_generic_node (buffer, CILK_FOR_BODY (node), spc + 4, flags, false);
+      newline_and_indent (buffer, spc + 2);
+      pp_right_brace (buffer);
+      is_expr = false;
+      break;
+
     default:
       NIY;
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index 8eecba7..9c0bfe2 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1285,6 +1285,16 @@ DEFTREECODE (CILK_SPAWN_STMT, "cilk_spawn_stmt", tcc_statement, 1)
 /* Cilk Sync statement: Does not have any operands.  */
 DEFTREECODE (CILK_SYNC_STMT, "cilk_sync_stmt", tcc_statement, 0)
 
+/* Cilk for statement
+   Operand 0 is the initializer.
+   Operand 1 is the loop terminating condition.
+   Operand 2 is the increment/decrement expression.
+   Operand 3 is the loop-body.
+   Operand 4 is the scope.
+   Operand 5 is the induction variable.
+   Operand 6 is the grain that is passed in through a pragma.  */
+DEFTREECODE (CILK_FOR_STMT, "cilk_for_stmt", tcc_statement, 7)
+
 /*
 Local variables:
 mode:c
diff --git a/gcc/tree.h b/gcc/tree.h
index e58b3a5..000a448 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -804,6 +804,15 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 /* Cilk keywords accessors.  */
 #define CILK_SPAWN_FN(NODE) TREE_OPERAND (CILK_SPAWN_STMT_CHECK (NODE), 0)
 
+/* CILK_FOR_STMT accessors.  */
+#define CILK_FOR_INIT(NODE)     TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 0)
+#define CILK_FOR_COND(NODE)     TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 1)
+#define CILK_FOR_EXPR(NODE)     TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 2)
+#define CILK_FOR_BODY(NODE)     TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 3)
+#define CILK_FOR_SCOPE(NODE)    TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 4)
+#define CILK_FOR_VAR(NODE)      TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 5)
+#define CILK_FOR_GRAIN(NODE)    TREE_OPERAND (CILK_FOR_STMT_CHECK (NODE), 6)
+
 /* In a RESULT_DECL, PARM_DECL and VAR_DECL, means that it is
    passed by invisible reference (and the TREE_TYPE is a pointer to the true
    type).  */

[-- Attachment #3: diff_cilk_for_c++.txt --]
[-- Type: text/plain, Size: 62016 bytes --]

diff --git a/gcc/cp/cp-cilk.c b/gcc/cp/cp-cilk.c
old mode 100644
new mode 100755
index 0da95e8..bd114c8
--- a/gcc/cp/cp-cilk.c
+++ b/gcc/cp/cp-cilk.c
@@ -23,8 +23,13 @@
 #include "system.h"
 #include "coretypes.h"
 #include "cp-tree.h"
+#include "tree.h"
 #include "tree-iterator.h"
 #include "cilk.h"
+#include "langhooks.h"
+#include "cgraph.h"
+#include "gimple.h"
+#include "gimplify.h"
 
 /* Sets the EXCEPTION bit (0x10) in the FRAME.flags field.  */
 
@@ -116,3 +121,483 @@ cilk_create_lambda_fn_tmp_var (tree lambda_fn)
   add_local_decl (cfun, return_var);
   return return_var;
 }
+
+/* Returns an expression that performs the operation CODE on OP0 and OP1,
+   using an overloaded operator where needed.  If CRY is true, then the
+   function complains when it is unable to find an overloaded operator.  */
+
+static tree
+callable (location_t loc, enum tree_code code, tree op0, tree op1, bool cry)
+{
+  vec<tree, va_gc> *op1_vec = make_tree_vector_single (op1);
+  if (code == INIT_EXPR)
+    return build_special_member_call (NULL_TREE, complete_ctor_identifier,
+				      &op1_vec,
+				      TYPE_MAIN_VARIANT (TREE_TYPE (op1)), 0,
+				      cry);
+    
+  if (code == PSEUDO_DTOR_EXPR)
+    return build_special_member_call (NULL_TREE, complete_dtor_identifier,
+				      &op1_vec,
+				      TYPE_MAIN_VARIANT (TREE_TYPE (op1)), 0,
+				      cry);
+
+  int flags = LOOKUP_PROTECT | LOOKUP_ONLYCONVERTING;
+  tree exp = build_new_op (EXPR_LOCATION (op1), code, flags, op0, op1,
+			   NULL_TREE, NULL, 0);
+  if (exp == error_mark_node)
+    exp = build_x_modify_expr (EXPR_LOCATION (op1), op0, code, op1, tf_none);
+  if (exp && exp != error_mark_node)
+    return exp;
+
+  const char *op = operator_name_info[(int) code].name;
+  const char *explain = cry ? "" : " accessible, unambiguous";
+  if (op1) 
+    error_at (loc, "No%s operator%s(%T,%T) for _Cilk_for loop", explain, op, 
+	      TREE_TYPE (op0), TREE_TYPE (op1)); 
+  else 
+    error_at (loc, "No%s operator%s(%T,%T) for _Cilk_for loop", explain, op, 
+	      TREE_TYPE (op0), TREE_TYPE (op0));
+  return NULL_TREE;
+}
+
+/* Calculates the COUNT_UP and/or COUNT_DOWN values for a _Cilk_for loop using
+   its characteristics stored in *CFD.  */
+
+static void
+calc_count_up_count_down (struct cilk_for_desc *cfd, tree *count_up,
+			  tree *count_down)
+{
+  /* Reasoning for high and low variables can be found in
+     cilk_compute_loop_count in c-family/cilk.c.  */
+  tree high = cfd->end_var ? cfd->end_var : cfd->end_expr;
+  tree low = cfd->lower_bound ? cfd->lower_bound : cfd->var;
+
+  /* When these are invalid, we flag them in cilk_compute_loop_var.  This
+     condition is a bit rare.  */
+  if (high == error_mark_node || low == error_mark_node)
+    return;
+  
+  /* Only call this function if we are using an iterator.  */
+  gcc_assert (cfd->iterator);
+  
+  if (TREE_CODE (high) == TARGET_EXPR)
+    high = TARGET_EXPR_INITIAL (high);
+  if (TREE_CODE (low) == TARGET_EXPR)
+    low = TARGET_EXPR_INITIAL (low);
+  
+  if (TREE_CODE (low) == TREE_LIST)
+    low = TREE_VALUE (low);
+  high = cilk_tree_operand_noconv (high);
+  if (cfd->direction >= 0)
+    {
+      *count_up = build_x_binary_op (cfd->loc, MINUS_EXPR, high,
+				     TREE_CODE (high), low, TREE_CODE (low),
+				     NULL, tf_warning_or_error);
+      /* We should have already failed if this operator is not callable.  */
+      gcc_assert (*count_up != error_mark_node);
+    }
+  else
+    {
+      *count_down = build_x_binary_op (cfd->loc, MINUS_EXPR, low,
+				       TREE_CODE (low), high, TREE_CODE (high),
+				       NULL, tf_warning_or_error);
+      /* ...same reasoning as count up for the assert below.  */
+      gcc_assert (*count_down != error_mark_node);
+    }
+}
+
+/* Handler for iterator to compute the loop variable.  ADD_OP indicates
+   whether we need a '+' or '-' operation.  LOW indicates the starting point
+   and LOOP_VAR is the induction variable.  Returns an expression (or a
+   STATEMENT_LIST of expressions).  If it is unable to find the appropriate
+   iteration, then it returns an error mark node and its parent will set
+   the loop as invalid.  */
+
+static tree
+compute_loop_var_cp_iter_hdl (location_t loc, enum tree_code add_op,
+			      tree low, tree loop_var, tree var2)
+{
+  tree exp = build_new_op (loc, add_op, 0, low, loop_var, NULL_TREE, 0,
+			   tf_none);
+  if (exp == error_mark_node)
+    {
+      /* If we are here then operator+ or operator- could not be found.
+	 So, the other option is to use +=.  This requires storing values
+	 in the variable and then adding them one by one.  */
+      tree new_var = var2;
+      exp = alloc_stmt_list ();
+      tree new_stmt = build_x_modify_expr (loc, new_var, INIT_EXPR,
+					   build_zero_cst (TREE_TYPE (new_var)),
+					   tf_warning_or_error);
+      if (new_stmt == error_mark_node)
+	return error_mark_node;
+      append_to_statement_list (new_stmt, &exp);
+      new_stmt = build_x_modify_expr (loc, new_var, NOP_EXPR, low,
+				      tf_warning_or_error);
+      if (new_stmt == error_mark_node)
+	return error_mark_node;
+      append_to_statement_list (new_stmt, &exp);
+      new_stmt = build_x_modify_expr (loc, new_var, add_op, loop_var,
+				      tf_warning_or_error);
+      if (new_stmt == error_mark_node)
+	return error_mark_node;
+      append_to_statement_list (new_stmt, &exp);
+      return exp;
+    }
+  exp = cp_build_modify_expr (var2, INIT_EXPR, exp, tf_warning_or_error);
+  return exp;
+}
+
+/* Returns the body of the nested function for a _Cilk_for using the loop's
+   characteristic information from CFD.  The returned tree will be a
+   STATEMENT_LIST.  */
+
+static tree
+cp_create_cilk_for_body (struct cilk_for_desc *cfd)
+{
+  push_function_context ();
+  declare_cilk_for_parms (cfd);
+  cfd->wd.fntype = build_function_type (void_type_node, cfd->wd.argtypes);
+
+  tree fndecl = cilk_create_cilk_helper_decl (&cfd->wd);
+  fndecl = build_lang_decl (FUNCTION_DECL, DECL_NAME (fndecl), cfd->wd.fntype);
+  if (cfd->nested_ok)
+    DECL_CONTEXT (fndecl) = current_function_decl;
+  else
+    DECL_CONTEXT (fndecl) = DECL_CONTEXT (current_function_decl);
+
+  tree outer = current_function_decl;
+  SET_DECL_LANGUAGE (fndecl, lang_c);
+  start_preparsed_function (fndecl, NULL_TREE, SF_PRE_PARSED);
+
+  declare_cilk_for_vars (cfd, fndecl);
+  
+  tree lower_bound = cfd->lower_bound;
+  struct gimplify_ctx gctx;
+
+  tree body = begin_compound_stmt (BCS_FN_BODY);
+  push_gimplify_context (&gctx);
+
+  gimple_add_tmp_var (cfd->var2);
+
+  /* Get the lower bound into a variable unless it is a constant or a
+     non-copyable value.  If non-copyable value, then reference value from
+     the outer frame.  */
+  if (!lower_bound)
+    {
+      lower_bound = cfd->var;
+      tree hack = build_decl (cfd->loc, VAR_DECL, NULL_TREE,
+			      TREE_TYPE (lower_bound));
+      DECL_CONTEXT (hack) = DECL_CONTEXT (lower_bound);
+      *pointer_map_insert (cfd->wd.decl_map, hack) = lower_bound;
+      lower_bound = hack;
+    }
+  tree cast_max_expr, count_type, pre, loop_var;
+  if (INTEGRAL_TYPE_P (cfd->var_type))
+    {
+      loop_var = create_tmp_var (cfd->var_type, NULL);
+      count_type = cfd->var_type;
+      tree cvt_expr = cp_fold_convert (cfd->var_type, cfd->min_parm);
+      pre = build_x_modify_expr (cfd->loc, loop_var, NOP_EXPR, cvt_expr,
+				 tf_warning_or_error);
+      cast_max_expr = cp_fold_convert (count_type, cfd->max_parm);
+    }
+  else
+    {
+      loop_var = create_tmp_var (TREE_TYPE (cfd->min_parm), NULL);
+      count_type = cfd->count_type;
+      pre = fold_build2 (INIT_EXPR, void_type_node, loop_var, cfd->min_parm);
+      cast_max_expr = cfd->max_parm;
+    }
+
+  tree loop_body = alloc_stmt_list ();
+  
+  /* Concatenate the control variable initialization with the loop body.
+     Do not call gimplify_and_add to append to list because we need
+     to wrap the entire list in a cleanup point expr to delay destruction
+     of the control variable to the end of the loop if it is an iterator.  */
+  tree loop_end_comp = cilk_compute_loop_var (cfd, loop_var, lower_bound,
+					      compute_loop_var_cp_iter_hdl);
+  if (loop_end_comp == error_mark_node)
+    {
+      cfd->invalid = true;
+      return error_mark_node;
+    }
+  append_to_statement_list (loop_end_comp, &loop_body);
+  tree cleanup = cxx_maybe_build_cleanup (cfd->var2, tf_none);
+  if (cleanup)
+    {
+      append_to_statement_list (cfd->body, &loop_body);
+      append_to_statement_list (cleanup, &loop_body);
+    }
+  else
+    append_to_statement_list (cfd->body, &loop_body);
+
+  loop_body = fold_build_cleanup_point_expr (void_type_node, loop_body);
+  DECL_SEEN_IN_BIND_EXPR_P (cfd->var2) = 1;
+
+  cfd->wd.context = outer;
+  bool throws = flag_exceptions ? cp_function_chain->can_throw : false;
+  cilk_outline_body (fndecl, &loop_body, &cfd->wd, &throws);
+  cp_function_chain->can_throw = throws;
+  
+  /* We have to manually create this loop for two reasons:
+     a. We need to have access to continue and start label since we need
+        to resolve continue and breaks by hand.
+     b. C++ doesn't provide a c_finish_loop function like C does.  */
+  tree c_for_loop = push_stmt_list ();
+  tree slab = build_decl (cfd->loc, LABEL_DECL, NULL_TREE, void_type_node);
+  DECL_ARTIFICIAL (slab) = 0;
+  DECL_IGNORED_P (slab) = 1;
+  DECL_CONTEXT (slab) = fndecl;
+  tree top_label = build1 (LABEL_EXPR, void_type_node, slab);
+
+  tree cont_lab = build_decl (cfd->loc, LABEL_DECL, NULL_TREE, void_type_node);
+  DECL_ARTIFICIAL (cont_lab) = 0;
+  DECL_IGNORED_P (cont_lab) = 1;
+  DECL_CONTEXT (cont_lab) = fndecl;
+
+  tree continue_label = build1 (LABEL_EXPR, void_type_node, cont_lab);
+  tree loop_cond = fold_build2 (LT_EXPR, boolean_type_node, loop_var,
+				cast_max_expr);
+  tree cond_expr = build3 (COND_EXPR, void_type_node, loop_cond,
+			   build1 (GOTO_EXPR, void_type_node, slab),
+			   build_empty_stmt (cfd->loc));
+  tree mod_expr = fold_build2 (MODIFY_EXPR, void_type_node, loop_var,
+				build2 (PLUS_EXPR, count_type, loop_var,
+					build_one_cst (count_type)));
+  add_stmt (pre);
+  add_stmt (top_label);
+  add_stmt (loop_body);
+  add_stmt (continue_label);
+  add_stmt (mod_expr);
+  add_stmt (cond_expr);
+  pop_stmt_list (c_for_loop);
+
+  /* Resolve all the continues in the _Cilk_for body here.  */
+  walk_tree (&c_for_loop, cilk_resolve_continue_stmts, (void *) cont_lab, NULL);
+  add_stmt (c_for_loop);
+
+  DECL_INITIAL (fndecl) = make_node (BLOCK);
+  TREE_USED (DECL_INITIAL (fndecl)) = 1;
+  BLOCK_VARS (DECL_INITIAL (fndecl)) = loop_var;
+  TREE_CHAIN (loop_var) = cfd->var2;
+
+  body = build3 (BIND_EXPR, void_type_node, loop_var, body,
+		 DECL_INITIAL (fndecl));
+  DECL_CONTEXT (cfd->var2) = fndecl;
+  pop_gimplify_context (0);
+
+  finish_function_body (body);
+  
+  /* A nested function cannot be expanded or deferred until its parent is done.
+     So, don't call expand_or_defer_fn here.  A non-nested function must be
+     done here.  */
+  if (!cfd->nested_ok)
+    expand_or_defer_fn (fndecl);
+  
+  pop_function_context ();
+  return fndecl;
+}
+
+/* Creates a nested function for the _Cilk_for statement using its information
+   in CFD.  PRE_P is the preceding gimple sequence.  */
+
+static tree
+create_cilk_for_nested_fn (struct cilk_for_desc *cfd, gimple_seq *pre_p)
+{
+  tree var = cfd->var;
+  DECL_CONTEXT (var) = current_function_decl;
+
+  if (POINTER_TYPE_P (TREE_TYPE (var)))
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_WRITE);
+  else
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_READ);
+
+  tree incr = cfd->incr;
+
+  /* If the loop increment is not an integer constant and is not a DECL,
+     copy it to a temporary.  If it is modified during the loop the behavior
+     is undefined.  Races could be avoided by copying it to a temporary
+     variable.  */
+  if (TREE_CODE (incr) != INTEGER_CST && !DECL_P (incr))
+    {
+      incr = get_formal_tmp_var (incr, pre_p);
+      cfd->incr = incr;
+    }
+
+  if (DECL_P (incr) && !TREE_STATIC (incr) && !DECL_EXTERNAL (incr))
+    *pointer_map_insert (cfd->wd.decl_map, incr) = incr;
+
+  /* Map the loop variable to integer_minus_one_node if we won't really be
+     passing it into the loop body.  Otherwise map to integer_zero_node.  */
+  *pointer_map_insert (cfd->wd.decl_map, var) =
+    (void *) (cfd->lower_bound ? integer_minus_one_node : integer_zero_node);
+  cilk_extract_free_variables (cfd->body, &cfd->wd, ADD_READ);
+
+  tree fn = cp_create_cilk_for_body (cfd);
+
+  /* FN can be error_mark_node when, for example, no appropriate overloaded
+     operator could be found.  */
+  if (fn == error_mark_node)
+    return error_mark_node;
+
+  DECL_UNINLINABLE (fn) = 1;
+  DECL_STATIC_CHAIN (fn) = 1;
+
+  current_function_decl = fn;
+  /* Genericize the _Cilk_for body, mainly split up the _Cilk_for body and
+     the for-loop we inserted.  */
+  cp_genericize (fn);
+  return fn;
+}
+
+/* Helper function to gimplify a CILK_FOR_STMT.  CFD holds all the values
+   extracted from a CILK_FOR_STMT and *PRE_P is the preceding sequence.  */
+
+static void
+gimplify_cilk_for_1 (struct cilk_for_desc cfd, gimple_seq *pre_p)
+{
+  bool order_variable = false;
+  tree parent_function = current_function_decl;
+  
+  if (TREE_SIDE_EFFECTS (cfd.end_expr))
+    {
+      enum tree_code ecode = TREE_CODE (cfd.end_expr);
+      if (ecode == INIT_EXPR || ecode == MODIFY_EXPR)
+	cfd.end_var = TREE_OPERAND (cfd.end_expr, 0);
+      else if (ecode == TARGET_EXPR)
+	{
+	  cfd.end_var = TARGET_EXPR_INITIAL (cfd.end_expr);
+	  if (TREE_CODE (cfd.end_var) == AGGR_INIT_EXPR)
+	    cfd.end_var = TARGET_EXPR_SLOT (cfd.end_expr);
+	  else
+	    cfd.end_var = get_formal_tmp_var (cfd.end_var, pre_p);
+	}
+      else if (ecode == CALL_EXPR)
+	cfd.end_var = cfd.end_expr;
+      else
+	{
+	  tree ii_tree = cfd.end_expr;
+	  while (TREE_CODE_CLASS (TREE_CODE (ii_tree)) == tcc_unary)
+	    ii_tree = TREE_OPERAND (ii_tree, 0);
+	  if (TREE_CODE (ii_tree) == ADDR_EXPR)
+	    ii_tree = TREE_OPERAND (ii_tree, 0);
+	  ecode = TREE_CODE (ii_tree);
+	  tree tmp_var = cilk_tree_operand_noconv (cfd.end_expr);
+	  cfd.end_var = get_formal_tmp_var (tmp_var, pre_p);
+	  order_variable = true;
+	}
+    }
+  tree cond = cfd.cond;
+  tree op1 = TREE_OPERAND (cond, 1);
+  tree op0 = TREE_OPERAND (cond, 0);
+  enum tree_code cond_code = TREE_CODE (cond);
+
+  /* In this case below, we have an overloaded boolean comparison operation.  */
+  if (cond_code == CALL_EXPR)
+    {
+      cond_code = cilk_find_code_from_call (CALL_EXPR_FN (cond));
+      op1 = cilk_tree_operand_noconv (CALL_EXPR_ARG (cond, 1));
+      op0 = cilk_tree_operand_noconv (CALL_EXPR_ARG (cond, 0));
+      if (TREE_CODE (op0) == ADDR_EXPR || TREE_CODE (op0) == INDIRECT_REF)
+	op0 = TREE_OPERAND (op0, 0);
+    }
+  if (order_variable && op1 == cfd.end_expr)
+    op1 = cfd.end_var;
+  else if (order_variable && op0 == cfd.end_expr)
+    op0 = cfd.end_var;
+  
+  cond = callable (cfd.loc, cond_code, op0, op1, false);
+  gcc_assert (cond != NULL_TREE);
+
+  if (TREE_CODE (TREE_TYPE (cond)) != BOOLEAN_TYPE)
+    cond = perform_implicit_conversion (boolean_type_node, cond,
+					tf_warning_or_error);
+  enum tree_code div_op = NOP_EXPR;
+  tree forward = NULL_TREE, count_up = NULL_TREE, count_down = NULL_TREE;
+  cilk_calc_forward_div_op (&cfd, &div_op, &forward);
+  if (cfd.iterator)
+    calc_count_up_count_down (&cfd, &count_up, &count_down);
+  
+  tree count = cilk_compute_loop_count (&cfd, div_op, forward, count_up,
+					count_down);
+  tree fn = create_cilk_for_nested_fn (&cfd, pre_p);
+  if (fn == error_mark_node)
+    return;
+  cfd.cond = cond;
+  
+  current_function_decl = parent_function;
+  gimple_seq inner_seq = insert_cilk_for_nested_fn (&cfd, count, fn);
+  gimple_seq_add_seq (pre_p, inner_seq);
+}
+
+/* Extract all the relevant information from CFOR, a CILK_FOR_STMT tree
+   and store them in CFD structure.  */
+
+static void
+cp_extract_cilk_for_fields (struct cilk_for_desc *cfd, tree cfor)
+{
+  cfd->var = CILK_FOR_VAR (cfor);
+  cfd->cond = CILK_FOR_COND (cfor);
+  cfd->lower_bound = CILK_FOR_INIT (cfor);
+  cfd->incr = CILK_FOR_EXPR (cfor);
+  cfd->loc = EXPR_LOCATION (cfor);
+  cfd->body = CILK_FOR_BODY (cfor);
+  cfd->grain = CILK_FOR_GRAIN (cfor);
+  cfd->invalid = false;
+
+  /* This function shouldn't be setting these two variables.  */
+  cfd->ctx_arg = NULL_TREE;
+  cfd->count = NULL_TREE;
+  
+  cilk_set_init_info (cfd);
+  cilk_set_inclusive_and_direction (cfd);
+  cilk_set_iter_difftype (cfd);
+
+  if (cfd->iterator)
+    {
+      tree exp = NULL_TREE;
+      tree hack = build_decl (cfd->loc, VAR_DECL, NULL_TREE,
+			      TREE_TYPE (cfd->var));
+      if (cfd->direction >= 0)
+	exp = callable (cfd->loc, MINUS_EXPR, hack, cfd->var, true);
+      else
+	exp = callable (cfd->loc, MINUS_EXPR, cfd->var, hack, true);
+      if (!exp) 
+	{ 
+	  cfd->invalid = true;
+	  return;
+	}
+      cfd->difference_type = TYPE_MAIN_VARIANT (TREE_TYPE (exp));
+    }
+  cfd->count_type = cilk_check_loop_difference_type (cfd->difference_type);
+  cilk_set_incr_info (cfd, true);
+}
+
+/* Entry function to gimplify a CILK_FOR_STMT, *FOR_P.  *PRE_P and *POST_P are
+   the preceding and following gimple sequences of *FOR_P, respectively.  */
+
+int
+cp_gimplify_cilk_for (tree *for_p, gimple_seq *pre_p,
+		      gimple_seq *post_p ATTRIBUTE_UNUSED)
+{
+  struct cilk_for_desc cfd;
+
+  cfun->is_cilk_function = 1;
+  cilk_init_cfd (&cfd);
+
+  cp_extract_cilk_for_fields (&cfd, *for_p);
+  if (cfd.invalid)
+    {
+      *for_p = build_empty_stmt (cfd.loc);
+      return GS_ERROR;
+    }
+  cfd.nested_ok = !DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (current_function_decl);
+  gimplify_cilk_for_1 (cfd, pre_p);
+  *for_p = NULL_TREE;
+
+  return GS_ALL_DONE;
+}
+
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index c464719..b40e9a6 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -269,6 +269,23 @@ genericize_cp_loop (tree *stmt_p, location_t start_locus, tree cond, tree body,
   *stmt_p = stmt_list;
 }
 
+/* Genericize a CILK_FOR_STMT node *STMT_P.  */
+
+static void
+genericize_cilk_for_stmt (tree *stmt_p, int *walk_subtrees, void *data)
+{
+  tree stmt = *stmt_p;
+  cp_walk_tree (&CILK_FOR_COND (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_INIT (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_GRAIN (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_VAR (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_EXPR (stmt), cp_genericize_r, data, NULL);
+
+  /* _Cilk_for body will be resolved after it is inserted into a nested
+     function.  */
+  *walk_subtrees = 0;
+} 
+
 /* Genericize a FOR_STMT node *STMT_P.  */
 
 static void
@@ -1121,6 +1138,8 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void *data)
     gcc_assert (!CONVERT_EXPR_VBASE_PATH (stmt));
   else if (TREE_CODE (stmt) == FOR_STMT)
     genericize_for_stmt (stmt_p, walk_subtrees, data);
+  else if (TREE_CODE (stmt) == CILK_FOR_STMT)
+    genericize_cilk_for_stmt (stmt_p, walk_subtrees, data);
   else if (TREE_CODE (stmt) == WHILE_STMT)
     genericize_while_stmt (stmt_p, walk_subtrees, data);
   else if (TREE_CODE (stmt) == DO_STMT)
diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index 77a66c3..baf3ee3 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -164,4 +164,7 @@ extern void cp_common_init_ts (void);
 #undef  LANG_HOOKS_CILKPLUS_FRAME_CLEANUP
 #define LANG_HOOKS_CILKPLUS_FRAME_CLEANUP cp_cilk_install_body_wframe_cleanup
 
+#undef  LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR
+#define LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR cp_gimplify_cilk_for
+
 #endif /* GCC_CP_OBJCP_COMMON */
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 351158e..ee045b8 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5686,6 +5686,10 @@ extern void finish_for_init_stmt		(tree);
 extern void finish_for_cond			(tree, tree, bool);
 extern void finish_for_expr			(tree, tree);
 extern void finish_for_stmt			(tree);
+extern tree begin_cilk_for_stmt                 (tree, tree);
+extern void finish_cilk_for_init_stmt           (tree);
+extern tree finish_cilk_for_stmt                (tree);
+extern tree finish_cilk_for_cond                (tree);
 extern tree begin_range_for_stmt		(tree, tree);
 extern void finish_range_for_decl		(tree, tree, tree);
 extern void finish_range_for_stmt		(tree);
@@ -6192,6 +6196,8 @@ extern int gimplify_cilk_spawn                  (tree *, gimple_seq *,
 /* In cp/cp-cilk.c */
 extern void cp_cilk_install_body_wframe_cleanup (tree, tree);
 extern tree cilk_create_lambda_fn_tmp_var       (tree);
+extern int cp_gimplify_cilk_for                 (tree *, gimple_seq *,
+						 gimple_seq *);
 /* -- end of C++ */
 
 #endif /* ! GCC_CP_TREE_H */
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index ced596e..ae03c56 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -1542,6 +1542,7 @@ begin_scope (scope_kind kind, tree entity)
     case sk_try:
     case sk_catch:
     case sk_for:
+    case sk_cilk_for:
     case sk_cond:
     case sk_class:
     case sk_scoped_enum:
diff --git a/gcc/cp/name-lookup.h b/gcc/cp/name-lookup.h
index 57641a1..66d1876 100644
--- a/gcc/cp/name-lookup.h
+++ b/gcc/cp/name-lookup.h
@@ -107,6 +107,8 @@ typedef enum scope_kind {
   sk_catch,	     /* A catch-block.  */
   sk_for,	     /* The scope of the variable declared in a
 			for-init-statement.  */
+  sk_cilk_for,       /* The scope of the variable declared in _Cilk_for init
+			statement.  */
   sk_cond,	     /* The scope of the variable declared in the condition
 			of an if or switch statement.  */
   sk_function_parms, /* The scope containing function parameters.  */
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 29be9a8..0b5621d 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -235,6 +235,10 @@ static tree cp_literal_operator_id
 
 static void cp_parser_cilk_simd
   (cp_parser *, cp_token *);
+static tree cp_parser_cilk_for
+  (cp_parser *, tree);
+static void cp_parser_cilk_grainsize
+  (cp_parser *, cp_token *);
 static bool cp_parser_omp_declare_reduction_exprs
   (tree, cp_parser *);
 
@@ -2060,7 +2064,8 @@ static tree cp_parser_decltype
 /* Declarators [gram.dcl.decl] */
 
 static tree cp_parser_init_declarator
-  (cp_parser *, cp_decl_specifier_seq *, vec<deferred_access_check, va_gc> *, bool, bool, int, bool *, tree *);
+  (cp_parser *, cp_decl_specifier_seq *, vec<deferred_access_check, va_gc> *,
+   bool, bool, int, bool *, tree *, tree *);
 static cp_declarator *cp_parser_declarator
   (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool);
 static cp_declarator *cp_parser_direct_declarator
@@ -9350,6 +9355,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 
 	case RID_WHILE:
 	case RID_DO:
+	case RID_CILK_FOR:
 	case RID_FOR:
 	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
@@ -10505,6 +10511,17 @@ cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
       }
       break;
 
+    case RID_CILK_FOR:
+      if (!flag_enable_cilkplus)
+	{ 
+	  error_at (token->location,
+		    "-fcilkplus must be enabled to use %<_Cilk_for%>");
+	  statement = error_mark_node;
+	}
+      else
+	statement = cp_parser_cilk_for (parser, NULL_TREE);
+      break;
+
     default:
       cp_parser_error (parser, "expected iteration-statement");
       statement = error_mark_node;
@@ -10624,9 +10641,15 @@ cp_parser_jump_statement (cp_parser* parser)
 	case IN_OMP_FOR:
 	  error_at (token->location, "break statement used with OpenMP for loop");
 	  break;
+
 	case IN_CILK_SIMD_FOR:
 	  error_at (token->location, "break statement used with Cilk Plus for loop");
 	  break;
+
+	case IN_CILK_FOR_STMT:
+	  error_at (token->location,
+		    "break statement used in _Cilk_for loop body");
+	  break;
 	}
       cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
       break;
@@ -10642,6 +10665,7 @@ cp_parser_jump_statement (cp_parser* parser)
 		    "continue statement within %<#pragma simd%> loop body");
 	  /* Fall through.  */
 	case IN_ITERATION_STMT:
+	case IN_CILK_FOR_STMT:
 	case IN_OMP_FOR:
 	  statement = finish_continue_stmt ();
 	  break;
@@ -11188,7 +11212,7 @@ cp_parser_simple_declaration (cp_parser* parser,
 					/*member_p=*/false,
 					declares_class_or_enum,
 					&function_definition_p,
-					maybe_range_for_decl);
+					maybe_range_for_decl, NULL);
       /* If an error occurred while parsing tentatively, exit quickly.
 	 (That usually happens when in the body of a function; each
 	 statement is treated as a declaration-statement until proven
@@ -16439,7 +16463,8 @@ cp_parser_init_declarator (cp_parser* parser,
 			   bool member_p,
 			   int declares_class_or_enum,
 			   bool* function_definition_p,
-			   tree* maybe_range_for_decl)
+			   tree* maybe_range_for_decl,
+			   tree* init)
 {
   cp_token *token = NULL, *asm_spec_start_token = NULL,
            *attributes_start_token = NULL;
@@ -16447,7 +16472,9 @@ cp_parser_init_declarator (cp_parser* parser,
   tree prefix_attributes;
   tree attributes = NULL;
   tree asm_specification;
-  tree initializer;
+  /* Initialize INITIALIZER to avoid a "using potentially unset variable"
+     warning/error.  */
+  tree initializer = NULL_TREE;
   tree decl = NULL_TREE;
   tree scope;
   int is_initialized;
@@ -16584,7 +16611,8 @@ cp_parser_init_declarator (cp_parser* parser,
 	      DECL_STRUCT_FUNCTION (decl)->function_start_locus
 		= func_brace_location;
 	    }
-
+	  if (init)
+	    *init = initializer;
 	  return decl;
 	}
     }
@@ -16819,6 +16847,8 @@ cp_parser_init_declarator (cp_parser* parser,
 	finish_fully_implicit_template (parser, /*member_decl_opt=*/0);
     }
 
+  if (init)
+    *init = initializer;
   return decl;
 }
 
@@ -22984,6 +23014,7 @@ cp_parser_single_declaration (cp_parser* parser,
 				        member_p,
 				        declares_class_or_enum,
 				        &function_definition_p,
+					NULL,
 					NULL);
 
     /* 7.1.1-1 [dcl.stc]
@@ -31253,6 +31284,21 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
       cp_parser_cilk_simd (parser, pragma_tok);
       return true;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (context == pragma_external)
+	{
+	  error_at (pragma_tok->location,
+		    "%<#pragma cilk grainsize%> may only be used inside a "
+		    "function");
+	  break;
+	}
+
+      /* Ignore the pragma if Cilk Plus is not enabled.  */
+      if (flag_enable_cilkplus)
+	{
+	  cp_parser_cilk_grainsize (parser, pragma_tok);
+	  return true;
+	}
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -31569,6 +31615,213 @@ cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token)
   return;
 }
 
+static tree
+cp_parser_cilk_for_init_statement (cp_parser *parser, tree *init)
+{
+  cp_token *token = cp_lexer_peek_token (parser->lexer);
+  location_t loc = token->location;
+  tree decl_init = NULL_TREE;
+  if (token->type == CPP_SEMICOLON)
+    {
+      error_at (loc, "expected induction variable");
+      return error_mark_node;
+    }
+
+  if (cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_REGISTER)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_EXTERN)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_MUTABLE)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_THREAD))
+    {
+      error_at (loc, "storage class is not allowed");
+      cp_lexer_consume_token (parser->lexer);
+    }
+
+  if (token->type == CPP_NAME)
+    {
+      tree type = cp_parser_lookup_name_simple (parser, token->u.value, loc);
+      if (TREE_CODE (type) == VAR_DECL || TREE_CODE (type) == PARM_DECL)
+	{
+	  error_at (loc, "_Cilk_for loop initializer must declare variable");
+	  cp_parser_skip_to_end_of_statement (parser);
+	  return error_mark_node;
+	}
+    }
+  int flags = 0;
+  cp_decl_specifier_seq specs;
+  cp_parser_decl_specifier_seq (parser, CP_PARSER_FLAGS_NONE, &specs, &flags);
+  tree decl = cp_parser_init_declarator (parser, &specs, NULL, false, false,
+					 flags, NULL, NULL, &decl_init);
+  /* Sometimes, when the initializer is a constant, it is not saved in
+     DECL_INITIAL, so we use the value returned through DECL_INIT.  If
+     DECL_INITIAL is set, use it instead since it already has all the
+     necessary type casts.  */
+  if (DECL_INITIAL (decl))
+    decl_init = DECL_INITIAL (decl);
+
+  
+  if (processing_template_decl)
+    add_stmt (decl_init);
+  else
+    *init = decl_init;
+  parser->scope = NULL_TREE;
+  parser->qualifying_scope = NULL_TREE;
+  parser->object_scope = NULL_TREE;
+
+  if (decl == error_mark_node || DECL_INITIAL (decl) == error_mark_node
+      || TREE_TYPE (decl) == error_mark_node)
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      gcc_assert (errorcount || sorrycount);
+      return error_mark_node;
+    }
+  return decl;
+}
+
+static void
+cp_parser_cilk_grainsize (cp_parser *parser, cp_token *pragma_tok)
+{
+  if (cp_parser_require (parser, CPP_EQ, RT_EQ))
+    {
+      tree exp = cp_parser_binary_expression (parser, false, false,
+					      PREC_NOT_OPERATOR, NULL);
+      cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+      if (!exp || exp == error_mark_node)
+	{
+	  error_at (pragma_tok->location, "invalid grainsize for _Cilk_for");
+	  return;
+	}
+      cp_token *n_tok = cp_lexer_peek_token (parser->lexer);
+
+      /* Make sure the next token is _Cilk_for; it is invalid otherwise.  */
+      if (n_tok && n_tok->type == CPP_KEYWORD && n_tok->keyword == RID_CILK_FOR)
+	{
+	  cp_lexer_consume_token (parser->lexer);
+	  tree cfor = cp_parser_cilk_for (parser, exp);
+	  if (cfor && STATEMENT_CODE_P (TREE_CODE (cfor)))
+	    SET_EXPR_LOCATION (cfor, n_tok->location);
+	}
+      else
+	warning (0, "%<#pragma cilk grainsize%> is not followed by "
+		 "%<_Cilk_for%>");
+      return;
+    }
+  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+}
+
+/* Top-level function to parse _Cilk_for and the for statement
+   following <#pragma simd>.  */
+
+static tree
+cp_parser_cilk_for (cp_parser *parser, tree grain)
+{
+  bool valid = true;
+  tree cond = NULL_TREE;
+  tree incr_expr = NULL_TREE;
+  tree init = NULL_TREE;
+  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+
+  tree scope = begin_for_scope (&init); 
+  tree statement = begin_cilk_for_stmt (scope, init);
+
+  if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      return error_mark_node;
+    }
+
+  /* Parse initialization.  */ 
+  tree decl = cp_parser_cilk_for_init_statement (parser, &init);
+    
+  if (decl == error_mark_node)
+    valid = false;
+  else if (!decl || (TREE_CODE (decl) != VAR_DECL
+		     && TREE_CODE (decl) != DECL_EXPR))
+    {
+      error_at (loc, "_Cilk_for loop initializer does not declare a variable");
+      valid = false;
+      decl = error_mark_node;
+    }
+  if (cp_lexer_next_token_is (parser->lexer, CPP_COMMA))
+    {
+      error_at (loc, "_Cilk_for loop initializer cannot have multiple variable"
+		" declarations");
+      cp_parser_skip_to_end_of_statement (parser);
+      valid = false;
+    }
+
+  if (!valid)
+    /* Skip to the semicolon ending the init.  */
+    cp_parser_skip_to_end_of_statement (parser);
+  else
+    {
+      CILK_FOR_INIT (statement) = init;
+      CILK_FOR_VAR (statement) = decl;
+      finish_cilk_for_init_stmt (statement);
+    }
+
+  /* Parse condition.  */
+  if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON))
+    return error_mark_node;
+  if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
+    {
+      error_at (loc, "missing condition");
+      cond = error_mark_node;
+    }
+  else
+    { 
+      cond = cp_parser_condition (parser);
+      cond = finish_cilk_for_cond (cond); 
+      CILK_FOR_COND (statement) = cond;
+    }
+
+  if (cond == error_mark_node)
+    valid = false;
+  cp_parser_consume_semicolon_at_end_of_statement (parser);
+
+  /* Parse increment.  */
+  if (cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_PAREN))
+    {
+      error_at (loc, "missing increment");
+      incr_expr = error_mark_node;
+    }
+  else
+    incr_expr = cp_parser_expression (parser, false, NULL);
+  if (TREE_CODE (incr_expr) == ERROR_MARK)
+    {
+      cp_parser_skip_to_closing_parenthesis (parser, true, false, false);
+      valid = false;
+    }
+  if (!cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      valid = false;
+    }
+  
+  if (!valid)
+    {
+      gcc_assert (sorrycount || errorcount);
+      return error_mark_node;
+    }
+  
+  finish_for_expr (incr_expr, statement);
+  CILK_FOR_EXPR (statement) = incr_expr;
+  int saved_in_statement = parser->in_statement;
+  parser->in_statement = IN_CILK_FOR_STMT;
+  cp_parser_already_scoped_statement (parser);
+  parser->in_statement = saved_in_statement;
+  
+  CILK_FOR_GRAIN (statement) = grain;
+  statement = finish_cilk_for_stmt (statement);
+
+  /* Check if the body satisfies all the requirements of _Cilk_for.
+     If invalid, then just return error_mark_node.  */
+  if (statement == error_mark_node
+      || !cpp_validate_cilk_plus_loop (CILK_FOR_BODY (statement)))
+    return error_mark_node;
+  return statement;
+}
+
 /* Create an identifier for a generic parameter type (a synthesized
    template parameter implied by `auto' or a concept identifier). */
 
diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
index e26e350..8d1ce44 100644
--- a/gcc/cp/parser.h
+++ b/gcc/cp/parser.h
@@ -302,6 +302,8 @@ typedef struct GTY(()) cp_parser {
 #define IN_IF_STMT             16
 #define IN_CILK_SIMD_FOR       32
 #define IN_CILK_SPAWN          64
+#define IN_CILK_FOR_STMT       128
+  
   unsigned char in_statement;
 
   /* TRUE if we are presently parsing the body of a switch statement.
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 1b34434..302163f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13335,6 +13335,45 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
       finish_for_stmt (stmt);
       break;
 
+    case CILK_FOR_STMT:
+      {
+	stmt = begin_cilk_for_stmt (NULL_TREE, NULL_TREE);
+	CILK_FOR_INIT (stmt) = RECUR (CILK_FOR_INIT (t));
+	finish_cilk_for_init_stmt (stmt);
+	tmp = RECUR (CILK_FOR_VAR (t));
+	CILK_FOR_VAR (stmt) = tmp;
+	CILK_FOR_GRAIN (stmt) = CILK_FOR_GRAIN (t);
+
+	tmp = CILK_FOR_COND (t);
+	if (COMPARISON_CLASS_P (tmp))
+	  {
+	    tree op0 = RECUR (TREE_OPERAND (tmp, 0));
+	    tree op1 = RECUR (TREE_OPERAND (tmp, 1));
+	    tmp = build2 (TREE_CODE (tmp), boolean_type_node, op0, op1);
+	  }
+	CILK_FOR_COND (stmt) = tmp;
+
+	tmp = CILK_FOR_EXPR (t);
+	if (TREE_CODE (tmp) == MODIFY_EXPR)
+	  {
+	    tree lhs = TREE_OPERAND (tmp, 0);
+	    tree rhs = TREE_OPERAND (tmp, 1);
+	    lhs = RECUR (lhs);
+	    rhs = build2 (TREE_CODE (rhs), TREE_TYPE (lhs),
+			  RECUR (TREE_OPERAND (rhs, 0)),
+			  RECUR (TREE_OPERAND (rhs, 1)));
+	    tmp = build2 (MODIFY_EXPR, void_type_node, lhs, rhs);
+	  }
+	else
+	  tmp = build2 (TREE_CODE (tmp), void_type_node,
+			RECUR (TREE_OPERAND (tmp, 0)),
+			RECUR (TREE_OPERAND (tmp, 1)));
+	finish_for_expr (tmp, stmt);
+	RECUR (CILK_FOR_BODY (t));
+	stmt = finish_cilk_for_stmt (stmt);
+	CILK_FOR_GRAIN (stmt) = RECUR (CILK_FOR_GRAIN (t));	
+	break;
+      }
     case RANGE_FOR_STMT:
       {
         tree decl, expr;
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 5d68250..23565ae 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -826,7 +826,8 @@ finish_return_stmt (tree expr)
   return r;
 }
 
-/* Begin the scope of a for-statement or a range-for-statement.
+/* Begin the scope of a for-statement, a _Cilk_for statement,
+   or a range-for-statement.
    Both the returned trees are to be used in a call to
    begin_for_stmt or begin_range_for_stmt.  */
 
@@ -899,7 +900,7 @@ finish_for_cond (tree cond, tree for_stmt, bool ivdep)
 }
 
 /* Finish the increment-EXPRESSION in a for-statement, which may be
-   given by FOR_STMT.  */
+   given by FOR_STMT or CILK_FOR_STMT.  */
 
 void
 finish_for_expr (tree expr, tree for_stmt)
@@ -926,7 +927,10 @@ finish_for_expr (tree expr, tree for_stmt)
   expr = maybe_cleanup_point_expr_void (expr);
   if (check_for_bare_parameter_packs (expr))
     expr = error_mark_node;
-  FOR_EXPR (for_stmt) = expr;
+  if (TREE_CODE (for_stmt) == CILK_FOR_STMT)
+    CILK_FOR_EXPR (for_stmt) = expr;
+  else
+    FOR_EXPR (for_stmt) = expr;
 }
 
 /* Finish the body of a for-statement, which may be given by
@@ -6663,6 +6667,18 @@ finish_omp_cancellation_point (tree clauses)
   finish_expr_stmt (stmt);
 }
 \f
+
+/* Perform any canonicalization of the conditional in a Cilk for loop.  */
+tree
+finish_cilk_for_cond (tree cond)
+{
+  if (!processing_template_decl)
+    return cp_truthvalue_conversion (cond);
+  else
+    return cond;
+}
+\f
+
 /* Begin a __transaction_atomic or __transaction_relaxed statement.
    If PCOMPOUND is non-null, this is for a function-transaction-block, and we
    should create an extra compound stmt.  */
@@ -10615,4 +10631,51 @@ capture_decltype (tree decl)
   return type;
 }
 
+/* Begin a _Cilk_for-statement.  Returns a new CILK_FOR_STMT.
+   SCOPE and INIT should be the return of begin_for_scope,
+   or both NULL_TREE.  */
+
+tree
+begin_cilk_for_stmt (tree scope, tree init)
+{
+  tree cilk_for_stmt = build_stmt (input_location, CILK_FOR_STMT, NULL_TREE,
+				   NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE,
+				   NULL_TREE, NULL_TREE);
+  if (scope == NULL_TREE)
+    {
+      if (!init)
+	scope = begin_for_scope (&init);
+    }
+  CILK_FOR_INIT (cilk_for_stmt) = init;
+  CILK_FOR_SCOPE (cilk_for_stmt) = scope;
+  return cilk_for_stmt;
+}
+
+/* Finish the for-init-statement of a _Cilk_for statement, which is given
+   by C_FOR_STMT.  */
+
+void
+finish_cilk_for_init_stmt (tree c_for_stmt)
+{
+  if (processing_template_decl)
+    CILK_FOR_INIT (c_for_stmt) = pop_stmt_list (CILK_FOR_INIT (c_for_stmt));
+  CILK_FOR_BODY (c_for_stmt) = do_pushlevel (sk_block);
+}
+
+/* Finish the body of a _Cilk_for statement, which is given by
+   CILK_FOR_STMT.  Returns a CILK_FOR_STMT that is type checked.  */
+
+tree
+finish_cilk_for_stmt (tree cilk_for_stmt)
+{
+  CILK_FOR_BODY (cilk_for_stmt) = do_poplevel (CILK_FOR_BODY (cilk_for_stmt));
+  tree *scope_ptr = &CILK_FOR_SCOPE (cilk_for_stmt);
+  tree scope = *scope_ptr;
+  *scope_ptr = NULL;
+  add_stmt (do_poplevel (scope));
+  cp_finish_cilk_for_loop (&cilk_for_stmt, processing_template_decl);
+  add_stmt (cilk_for_stmt);
+  return cilk_for_stmt;
+}
+
 #include "gt-cp-semantics.h"
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc
new file mode 100644
index 0000000..dec650c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc
@@ -0,0 +1,42 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+
+int j[10];
+
+int main(void)
+{
+  int error = 0;
+  int j_serial[10];
+  for (int ii = 0; ii < 10; ii++)
+    {
+      j[ii] = 10;
+      j_serial[ii] = 10;
+    }
+  _Cilk_for (int ii = 5; ii < 10; ii++)
+    {
+      j[ii]=ii;
+    }
+
+  for (int ii = 5; ii < 10; ii++)
+    {
+      j_serial[ii] = ii;
+    }
+
+  for (int ii = 0; ii < 10; ii++)
+    {
+      if (j[ii] != j_serial[ii]) 
+	error = 1;    
+    }
+
+  if (error)
+    __builtin_abort ();
+  else
+    return 0;
+
+  return j[9];
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
new file mode 100644
index 0000000..8221371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
@@ -0,0 +1,25 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define SIZE 100
+#define CHECK_VALUE 5
+
+template <class T>
+int func (T start, T end)
+{
+  int Array[SIZE];
+  _Cilk_for (T ii = 0; ii < end; ii++)
+    Array[ii] = CHECK_VALUE;
+  
+  for (T ii = 0; ii < end; ii++)
+    if (Array[ii] != CHECK_VALUE)
+      __builtin_abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  return func <int> (0, 100) + func <long> (0, 100);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc
new file mode 100644
index 0000000..30ea29d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc
@@ -0,0 +1,34 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int main(int argc, char **argv)
+{
+  char Array1[26], Array2[26];
+  char Array1_Serial[26], Array2_Serial[26];
+
+  for (int ii = 0; ii < 26; ii++)  
+    { 
+      Array1[ii] = 'A'+ii;
+      Array1_Serial[ii] = 'A'+ii;
+    }
+  for (int ii = 0; ii < 26; ii++)
+    {
+      Array2[ii] = 'a'+ii;
+      Array2_Serial[ii] = 'a'+ii;
+    }
+
+  _Cilk_for (int ii = 0 ; ii < 26; ii++) 
+    Array1[ii] = Array2[ii];
+
+  for (int ii = 0; ii < 26; ii++)
+    Array1_Serial[ii] = Array2_Serial[ii];
+
+  for (int ii = 0; ii < 26; ii++)  {
+    if (Array1_Serial[ii] != Array1[ii])  { 
+	__builtin_abort ();
+    }
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc
new file mode 100644
index 0000000..3759a36
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc
@@ -0,0 +1,22 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int q[10], seq[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  for (int jj = 0; jj < 10; jj++)  
+	    {
+	      if (seq[jj] == 5)
+		continue;
+	      else
+		seq[jj] = 2;
+	    }
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc
new file mode 100644
index 0000000..38c4d51
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc
@@ -0,0 +1,19 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  for (int jj = 0; jj < 10; jj++) 
+	    seq2[jj] = 5;
+	  continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc
new file mode 100644
index 0000000..e68c700
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc
@@ -0,0 +1,18 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii = max - 1; ii >= start; ii--) 
+	{ 
+	  if (q[ii] != 0) 
+	    continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc
new file mode 100644
index 0000000..17fd064
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc
@@ -0,0 +1,23 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  int jj = 0;
+	  while (jj < 10)
+	    {
+	      seq2[jj] = 1;
+	      jj++;
+	    }
+	  continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc
new file mode 100644
index 0000000..f0ad2a3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc
@@ -0,0 +1,42 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <assert.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <vector>
+#include <list>
+#if HAVE_IO 
+#include <stdio.h>
+#endif
+#define NUMBER 500
+#include <stdlib.h>
+typedef std::pair<int, int> my_type_t;
+
+long
+valid_pairs(std::vector< my_type_t > my_list) 
+{
+  _Cilk_for (int ii = 0; ii < my_list.size(); ii++) 
+    {
+#if HAVE_IO
+    fprintf(stderr, "my_list index: %d, size: %zu.\n", ii, my_list.size());
+#endif
+      if (ii < 0 || ii >= my_list.size())
+	abort (); 
+    }
+  return 0;
+}
+
+int main(int argc, char **argv) 
+{
+  std::vector<my_type_t> my_list;
+
+  for (int ii = 0; ii < NUMBER; ii++) 
+    my_list.push_back(my_type_t(ii, ii));
+  long res = valid_pairs(my_list);
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc
new file mode 100644
index 0000000..7d54828
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc
@@ -0,0 +1,77 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int x = 5;
+int q = 25;
+int z = 2;
+
+int square (int b)
+{
+  return (b*b);
+}
+
+template<class T>
+int templated_func (T a, T b, T c)
+{
+  T Array[10];
+#pragma cilk grainsize = a
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = a;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != a)
+      __builtin_abort ();
+
+#pragma cilk grainsize = square ((int) (b/c))
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = b;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != b)
+      __builtin_abort ();
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = c;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != c)
+      __builtin_abort ();
+
+  return 0;
+}
+
+ 
+
+int main (void)
+{
+  int Array[10];
+#pragma cilk grainsize = 5
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 5)
+      __builtin_abort ();
+
+
+#pragma cilk grainsize = x
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 10;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 10)
+      __builtin_abort ();
+
+#pragma cilk grainsize = square (z)
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 15;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 15)
+      __builtin_abort ();
+
+  int r = 5, s=10, t =15;
+  return templated_func (r, s, t);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc
new file mode 100644
index 0000000..4c69712
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc
@@ -0,0 +1,52 @@
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+
+int main (void)
+{
+  int a, iii = 0;
+  _Cilk_for (; iii < 10; iii++) /* { dg-error "expected induction variable" } */
+    a = 5;
+
+  _Cilk_for (iii = 0; iii < 10; iii++) /* { dg-error " must declare variable" } */
+    a = 5;
+
+  _Cilk_for (int qq = 0, jj = 0; qq < 10; qq++) /* { dg-error " initializer cannot have multiple variable declarations" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0, int jj = 0; ii < 10; ii++) /* { dg-error " initializer cannot have multiple variable declarations" } */
+    a = 5;
+
+  _Cilk_for (int rr = 0; ; rr++) /* { dg-error "missing condition" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii = 5; ii++) /* { dg-error "invalid controlling predicate" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii == 5; ii++) /* { dg-error "invalid controlling predicate" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10;) /* { dg-error "missing increment" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii ) /* { dg-error "invalid increment expression" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      a = 5;
+      if (ii == 5)
+	break; /* { dg-error "break statement used in _Cilk_for loop body" } */
+    }
+
+#pragma cilk grainsize 5 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+
+#pragma Silk grainsize = 5 /* { dg-warning "ignoring #pragma Silk grainsize" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+#pragma cilk grainsiz = 5 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc
new file mode 100644
index 0000000..b597764
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc
@@ -0,0 +1,30 @@
+/* { dg-options "-fcilkplus" } */
+
+#include <setjmp.h>
+int main (void)
+{
+  int a, iii = 0;
+
+  _Cilk_for (volatile int ii = 0; ii < 10; ii++) /* { dg-error "iteration variable cannot be volatile" } */
+    a = 5;
+
+  _Cilk_for (static int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+  _Cilk_for (register int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+
+  _Cilk_for (extern int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+
+  _Cilk_for (float ii = 0.0; ii < 10.0; ii += 0.5) /* { dg-error "induction variable must be of integral record or pointer type" } */
+    a = 5;
+
+  jmp_buf env;
+  _Cilk_for (int ii = 0; ii < 10; ii++) 
+    {
+      a = 5;
+      setjmp (env); /* { dg-error "calls to setjmp are not allowed within" } */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc
new file mode 100644
index 0000000..89f6403
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc
@@ -0,0 +1,27 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+struct BruceBoxleitner {
+    int m;
+    BruceBoxleitner (int n = 0) : m(n) { }
+    BruceBoxleitner operator--() { --m; return *this; }
+};
+
+int operator- (BruceBoxleitner a, BruceBoxleitner b) { return a.m - b.m; }
+
+struct BruceLee {
+    int m;
+    explicit BruceLee (int n) : m(n) { }
+};
+
+bool operator> (BruceBoxleitner a, BruceLee b) { return a.m > b.m; }
+int operator- (BruceBoxleitner a, BruceLee b) { return a.m - b.m; }
+
+int main () {
+    _Cilk_for (BruceBoxleitner i = 10; i > BruceLee(0); --i)
+      ;
+    return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc
new file mode 100644
index 0000000..495e9b4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc
@@ -0,0 +1,26 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int main(void)
+{
+  int jj = 0;
+  int total = 0;
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if ((ii % 2) == 0)
+	goto hello_label;
+      else
+	goto world_label;
+
+hello_label:
+     total++;
+world_label:
+     total++;
+    }
+  if (total != 15)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc
new file mode 100644
index 0000000..582ef60
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc
@@ -0,0 +1,88 @@
+/* { dg-options "-fcilkplus" } */
+
+
+#define NUMBER_OF_ELEMENTS 10
+
+#include <cstdlib>
+
+class my_class {
+private:
+  int value;
+public:
+
+  my_class ();
+  my_class (const my_class &val);
+  my_class (my_class &val);
+  my_class (int val);
+  ~my_class ();
+  int getValue();
+  my_class &operator= (my_class &new_value);
+  my_class &operator= (int x);
+  my_class &operator+= (int val)
+  {
+    value += val;
+    return *this;
+  }
+  bool operator< (const my_class &val)
+  {
+    return (value < val.value);
+  }
+};
+
+
+my_class::my_class ()
+{
+  value = 0;
+}
+
+my_class::my_class(int val)
+{
+  value = val;
+}
+
+my_class::my_class (my_class &val)
+{
+  value = val.value;
+}
+
+my_class::my_class (const my_class &val)
+{
+  value = val.value;
+}
+
+my_class::~my_class ()
+{
+  value = -1;
+}
+
+int my_class::getValue ()
+{
+  return value;
+}
+
+my_class & my_class::operator= (my_class &new_value)
+{
+  value = new_value.value;
+  return *this;
+}
+
+my_class &my_class::operator= (int x)
+{
+  value = x;
+  return *this;
+}
+
+int main (void)
+{
+  int n, *array_parallel;
+  my_class length (NUMBER_OF_ELEMENTS);
+    n = NUMBER_OF_ELEMENTS;
+  
+  array_parallel = new int[NUMBER_OF_ELEMENTS];
+  _Cilk_for (my_class ii (0); ii < length; ii += 1) { /* { dg-error " No operator-" } */
+      int x = ii.getValue();
+    array_parallel [x] = x * 2;
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc
new file mode 100644
index 0000000..1326308
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc
@@ -0,0 +1,59 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#if HAVE_IO
+#include <cstdio>
+#endif
+#define TEST 1
+
+
+#define ITER 300
+
+int n_errors;
+#if TEST
+void test (int *array, int n, int val) {
+#if HAVE_IO
+    for (int i = 0; i < n; i++)
+      std::printf("array[%3d] = %2d\n", i, array[i]);
+#endif
+    for (int i = 0; i < n; ++i) {
+        if (array[i] != val) {
+           __builtin_abort (); 
+        }
+    }
+}
+#endif
+ 
+
+int main () {
+    int array[ITER];
+  
+    for (int ii = 0; ii < ITER; ii++)
+      array[ii] = 9;
+    _Cilk_for (int *j = (array); j < array + ITER; j += 1)  {
+       *j = 6; 
+    }
+#if TEST
+    test(array, ITER, 6);
+#endif
+
+    _Cilk_for (int *i = array; i < array + ITER; i += 1) {
+        *i = 1;
+    }
+
+#if TEST
+    test(array, ITER, 1);
+#endif
+
+    _Cilk_for (int *k = array+ITER-1; k >= array; k -= 1) {
+        *k = 8;
+    }
+#if TEST
+    test(array, ITER, 8);
+#endif
+  
+    return 0;
+
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc
new file mode 100644
index 0000000..0ca588d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc
@@ -0,0 +1,111 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#define NUMBER_OF_ELEMENTS 10
+
+#include <cstdlib>
+
+#if HAVE_IO
+#include <cstdio>
+#endif
+
+class my_class {
+private:
+  int value;
+public:
+
+  my_class ();
+  my_class (const my_class &val);
+  my_class (my_class &val);
+  my_class (int val);
+  ~my_class ();
+  int getValue();
+  my_class &operator= (my_class &new_value);
+  my_class &operator+= (int val)
+  {
+    value += val;
+    return *this;
+  }
+  bool operator< (const my_class &val)
+  {
+    return (value < val.value);
+  }
+};
+
+
+my_class::my_class ()
+{
+  value = 0;
+}
+
+my_class::my_class(int val)
+{
+  value = val;
+}
+
+my_class::my_class (my_class &val)
+{
+  value = val.value;
+}
+
+my_class::my_class (const my_class &val)
+{
+  value = val.value;
+}
+
+my_class::~my_class ()
+{
+  value = -1;
+}
+
+int my_class::getValue ()
+{
+  return value;
+}
+
+my_class & my_class::operator= (my_class &new_value)
+{
+  value = new_value.value;
+  return *this;
+}
+
+int operator- (my_class x, my_class y)
+{
+  int val_x = x.getValue ();
+  int val_y = y.getValue ();
+  return (val_x - val_y);
+}
+
+
+int main (void)
+{
+  int n, *array_parallel, *array_serial;
+  my_class length (NUMBER_OF_ELEMENTS);
+    n = NUMBER_OF_ELEMENTS;
+  
+  array_parallel = new int[NUMBER_OF_ELEMENTS];
+  array_serial = new int[NUMBER_OF_ELEMENTS];
+
+  _Cilk_for (my_class ii (0); ii < length; ii += 1) {
+#if HAVE_IO
+    std::printf("ii.getValue() = %d\n", ii.getValue ());
+#endif
+    array_parallel [ii.getValue ()] = ii.getValue() * 2;
+  }
+
+  for (my_class ii (0); ii < length; ii += 1)
+    array_serial [ii.getValue ()] = ii.getValue () * 2;
+  
+  for (int ii = 0; ii < NUMBER_OF_ELEMENTS; ii++)
+    if (array_serial[ii] != array_parallel[ii]) {
+#if HAVE_IO
+      std::printf("array_serial[%3d] = %6d\tarray_parallel[%3d] = %6d\n", ii,
+		  array_serial[ii], ii, array_parallel[ii]);
+#endif
+      __builtin_abort ();
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
new file mode 100644
index 0000000..e4f2ee5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
@@ -0,0 +1,58 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+int main(void)
+{
+vector <int> array;
+vector <int> array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back (ii);
+}
+#endif
+#if 1
+_Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end(); 
+	   iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+#endif
+for (vector<int>::iterator iter2 = array_serial.begin(); 
+     iter2 != array_serial.end(); iter2++)
+{
+   if (*iter2  == 6) 
+     *iter2 = 13;
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter3 = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+
+while (iter3 != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter3 != *iter_serial)
+    abort ();
+  iter3++;
+  iter_serial++;
+}
+
+return 0;
+}   
+
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
new file mode 100644
index 0000000..3e350a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
@@ -0,0 +1,50 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <iostream>
+#include <cstdio>
+#include <cstdlib>
+#include <vector>
+#include <algorithm>
+#include <list>
+
+using namespace std;
+
+
+int main(int argc, char **argv)
+{
+  vector <int> number_list, number_list_serial;
+  int new_number = 0;
+  int no_elements = 0;
+  
+  if (argc != 2)
+  {
+    no_elements = 10000;
+  }
+
+
+  number_list.clear();
+  number_list_serial.clear();
+  for (int ii = 0; ii < no_elements; ii++)
+  {
+    number_list.push_back(new_number);
+    number_list_serial.push_back(new_number);
+  }
+
+  _Cilk_for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list[jj] = jj + no_elements;
+  }
+  for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list_serial[jj] = jj + no_elements;
+  }
+
+  for (int jj = 0; jj < no_elements; jj++)
+    if (number_list_serial[jj] != number_list[jj])
+      abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
index 27412e8..ff5ea33 100644
--- a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
+++ b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
@@ -64,12 +64,10 @@ dg-finish
 
 dg-init
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O0 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O1 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O2 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O3 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -O0 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -O1 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -O2 -ftree-vectorize -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g -O3 -fcilkplus" " "
@@ -77,7 +75,6 @@ dg-finish
 
 dg-init
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O0 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -fcilkplus" " "


* Re: [PATCH] _Cilk_for for C and C++
  2013-11-15 21:45 [PATCH] _Cilk_for for C and C++ Iyer, Balaji V
  2013-11-16  1:38 ` Aldy Hernandez
@ 2013-11-20  8:05 ` Aldy Hernandez
  2013-11-27 18:37 ` Jason Merrill
  2 siblings, 0 replies; 42+ messages in thread
From: Aldy Hernandez @ 2013-11-20  8:05 UTC (permalink / raw)
  To: Iyer, Balaji V
  Cc: gcc-patches, Jeff Law, Jason Merrill (jason@redhat.com), rth


> One small thing that I have not done that Jakub and several other
> have asked me before is that, there are no tests in c-c++-common for
> _Cilk_for. The reason being that the syntax between C and C++
> implementations are different. In C++, the induction variable must be
> defined in the initializer (e.g. it should start wth _Cilk_for (int
> ii = 0....)). In C, this is not allowed (e.g. it should start as
> _Cilk_for (ii = 0; ii < 10; ii++)).

For #pragma simd, I put the tests in c-c++-common and passed "-std=c99"
to the C tests.  That should allow declaring the induction variable in
the loop initializer.

Can you do this?

Aldy
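
A minimal sketch of what a shared c-c++-common test body could look like
under this suggestion.  It is not part of the posted patch.  The
PARALLEL_FOR macro and the function names are hypothetical, and the sketch
assumes the compiler defines __cilk when -fcilkplus is active; without it
the loop falls back to a serial for so the file still builds.

```cpp
/* Hypothetical shared test body; not part of the posted patch.  */
#include <assert.h>

/* Assumption: __cilk is defined when -fcilkplus is active.  Otherwise
   fall back to a serial loop so the sketch still builds.  */
#ifdef __cilk
# define PARALLEL_FOR _Cilk_for
#else
# define PARALLEL_FOR for
#endif

#define N 10

/* Fill OUT[5..N-1] in the (possibly parallel) loop and return the number
   of elements that differ from a serial reference.  */
int
count_mismatches (void)
{
  int out[N], ref[N];
  int bad = 0;

  for (int i = 0; i < N; i++)
    out[i] = ref[i] = 10;

  /* The induction variable is declared in the initializer, which is
     valid in both C99 and C++.  */
  PARALLEL_FOR (int i = 5; i < N; i++)
    out[i] = i;

  for (int i = 5; i < N; i++)
    ref[i] = i;

  for (int i = 0; i < N; i++)
    if (out[i] != ref[i])
      bad++;
  return bad;
}

/* Entry point a test driver would call.  */
void
check_cilk_for (void)
{
  assert (count_mismatches () == 0);
}
```

Because the loop header uses only syntax common to C99 and C++, the same
file could be compiled by both front ends, with "-std=c99" passed on the C
side.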


* Re: [PATCH] _Cilk_for for C and C++
  2013-11-19  1:11   ` Iyer, Balaji V
@ 2013-11-22 19:45     ` Jason Merrill
  2013-11-26  8:16       ` Iyer, Balaji V
  2013-11-27 23:55     ` Jeff Law
  1 sibling, 1 reply; 42+ messages in thread
From: Jason Merrill @ 2013-11-22 19:45 UTC (permalink / raw)
  To: Iyer, Balaji V, Aldy Hernandez; +Cc: gcc-patches, Jeff Law, rth

On 11/18/2013 04:50 PM, Iyer, Balaji V wrote:
> +  int flags = LOOKUP_PROTECT | LOOKUP_ONLYCONVERTING;

Why not LOOKUP_NORMAL? LOOKUP_ONLYCONVERTING isn't relevant in this context.

> +  tree exp = build_new_op (EXPR_LOCATION (op1), code, flags, op0, op1,
> +			   NULL_TREE, NULL, 0);

Use tf_none instead of 0.

> +  if (exp == error_mark_node)
> +    exp = build_x_modify_expr (EXPR_LOCATION (op1), op0, code, op1, tf_none);
> +  if (exp && exp != error_mark_node)
> +    return exp;

This doesn't make sense to me.  build_x_modify_expr takes codes like 
PLUS_EXPR and then does an assignment afterward; we don't want to 
quietly do += just because there's some error with the evaluation of the 
+ operation.  What is this code trying to do?
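
To make the concern concrete, here is a hedged sketch. `step_iter` and
`advance_with_fallback` are hypothetical names, not part of the patch. The
type only offers `+=`, so replacing a failed `low + offset` with
`low += offset` compiles, but it changes `low` as a side effect.

```cpp
#include <cassert>

// Hypothetical iterator-like type: it provides operator+= but no operator+.
struct step_iter
{
  int pos;
  step_iter &operator+= (int n) { pos += n; return *this; }
};

// What a silent fallback from '+' to '+=' would amount to: the left operand
// itself is modified, which a plain '+' would never do.
int advance_with_fallback (step_iter &low, int offset)
{
  return (low += offset).pos;
}
```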

> +/* Handler for iterator to compute the loop variable.  ADD_OP indicates
> +   whether we need a '+' or '-' operation. LOW indicates the starting point
> +   and LOOP_VAR is the induction variable.  Returns an expression (or a
> +   STATEMENT_LIST of expressions).  If it is unable to find the appropriate
> +   iteration, then it returns an error mark node and its parent will set
> +   the loop as invalid.  */

This doesn't explain what VAR2 is.  And it seems like you're also using 
LOW as the increment?

> +      tree new_stmt = build_x_modify_expr (loc, new_var, INIT_EXPR,
> +					   build_zero_cst (TREE_TYPE (new_var)),
> +					   tf_warning_or_error);
> +      if (new_stmt == error_mark_node)
> +	return error_mark_node;
> +      append_to_statement_list (new_stmt, &exp);
> +      new_stmt = build_x_modify_expr (loc, new_var, NOP_EXPR, low,
> +				      tf_warning_or_error);

Why assign 0 if you're going to immediately assign low afterwards?

> +  /* We have to manually create this loop for two reasons:
> +     a. We need to have access to continue and start label since we need
> +        to resolve continue and breaks by hand.

Why do you need to resolve them by hand?  It looks like break isn't even 
allowed.

> +     b. C++ doesn't provide a c_finish_loop function like C does.  */

Why is that important?

>    sk_for,	     /* The scope of the variable declared in a
>  			for-init-statement.  */
> +  sk_cilk_for,       /* The scope of the variable declared in _Cilk_for init
> +			statement.  */

How is this different from a normal for-init-statement?  Nothing seems 
to use it.

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2013-11-22 19:45     ` Jason Merrill
@ 2013-11-26  8:16       ` Iyer, Balaji V
  2013-11-27 17:59         ` Jason Merrill
  0 siblings, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2013-11-26  8:16 UTC (permalink / raw)
  To: Jason Merrill, Aldy Hernandez; +Cc: gcc-patches, Jeff Law, rth

[-- Attachment #1: Type: text/plain, Size: 7232 bytes --]

Hi Jason,
	I am attaching a fixed patch. I have resolved all the issues you mentioned and have answered your questions inline below. I have not regenerated the C patch since nothing has changed in it.

Here are the ChangeLog entries:
gcc/cp/ChangeLog.
2013-11-25  Balaji V. Iyer  <balaji.v.iyer@intel.com>

        * cp-cilkplus.c: Added cgraph.h, gimple.h and gimplify.h.
        (callable): New function.
        (calc_count_up_count_down): Likewise.
        (compute_loop_var_cp_iter_hdl): Likewise.
        (cp_create_cilk_for_body): Likewise.
        (create_cilk_for_nested_fn): Likewise.
        (gimplify_cilk_for_1): Likewise.
        (cp_extract_cilk_for_fields): Likewise.
        (cp_gimplify_cilk_for): Likewise.
        * cp-gimplify.c (genericize_cilk_for_stmt): Likewise.
        (cp_genericize_r): Added a check for CILK_FOR_STMT.
        * cp-objcp-common.h (LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR): New
        #define.
        * cp-tree.h (begin_cilk_for_stmt): New prototype.
        (finish_cilk_for_stmt): Likewise.
        (finish_cilk_for_init_stmt): Likewise.
        (cp_gimplify_cilk_for): Likewise.
        * parser.c (cp_parser_cilk_grainsize): New function and prototype.
        (cp_parser_init_declarator): Added a new parameter to hold the
        initial value.
        (cp_parser_statement): Added RID_CILK_FOR case.
        (cp_parser_iteration_statement): Likewise.
        (cp_parser_jump_statement): Added IN_CILK_FOR_STMT case (twice).
        (cp_parser_pragma): Added PRAGMA_CILK_GRAINSIZE case.
        (cp_parser_cilk_for_init_statement): New function.
        (cp_parser_cilk_for): Renamed a parameter and added support for
        parsing _Cilk_for loops that are part of Cilk keywords.
        * parser.h (IN_CILK_FOR_STMT): New #define.
        * pt.c (tsubst_expr): Added CILK_FOR_STMT case.
        * semantics.c (begin_for_scope): Added "_Cilk_for statement" in the
        header comment.
        (finish_for_expr): Added support for CILK_FOR_STMT to use this
        function.
        (finish_cilk_for_cond): Added support for processing templates.
        (begin_cilk_for_stmt): New function.
        (finish_cilk_for_init_stmt): Likewise.
        (finish_cilk_for_stmt): Likewise.

gcc/testsuite/ChangeLog.
2013-11-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

        * g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc: New test.
        * g++.dg/cilk-plus/CK/cilk-for-tplt.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk-for.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk_for_grainsize.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk_for_p_errors.cc: Likewise.
        * g++.dg/cilk-plus/CK/cilk_for_t_errors.cc: Likewise.
        * g++.dg/cilk-plus/CK/explicit_ctor.cc: Likewise.
        * g++.dg/cilk-plus/CK/label_test.cc: Likewise.
        * g++.dg/cilk-plus/CK/no-opp-overload-error.cc: Likewise.
        * g++.dg/cilk-plus/CK/plus-equal-one.cc: Likewise.
        * g++.dg/cilk-plus/CK/plus-equal-test.cc: Likewise.
        * g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
        * g++.dg/cilk-plus/CK/stl_test.cc: Likewise.
        * g++.dg/cilk-plus/cilk-plus.exp: Added support to call _Cilk_for
        testcodes.

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Jason Merrill [mailto:jason@redhat.com]
> Sent: Friday, November 22, 2013 11:46 AM
> To: Iyer, Balaji V; Aldy Hernandez
> Cc: gcc-patches@gcc.gnu.org; Jeff Law; rth@redhat.com
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On 11/18/2013 04:50 PM, Iyer, Balaji V wrote:
> > +  int flags = LOOKUP_PROTECT | LOOKUP_ONLYCONVERTING;
> 
> Why not LOOKUP_NORMAL? LOOKUP_ONLYCONVERTING isn't relevant in
> this context.
> 

Fixed. I used LOOKUP_NORMAL

> > +  tree exp = build_new_op (EXPR_LOCATION (op1), code, flags, op0, op1,
> > +			   NULL_TREE, NULL, 0);
> 
> Use tf_none instead of 0.
> 

Fixed.

> > +  if (exp == error_mark_node)
> > +    exp = build_x_modify_expr (EXPR_LOCATION (op1), op0, code, op1, tf_none);
> > +  if (exp && exp != error_mark_node)
> > +    return exp;
> 
> This doesn't make sense to me.  build_x_modify_expr takes codes like
> PLUS_EXPR and then does an assignment afterward; we don't want to
> quietly do += just because there's some error with the evaluation of the
> + operation.  What is this code trying to do?
> 

Yes, that was a mistake on my side. The exp cannot be error_mark_node at this point. I am sorry. It is removed.

> > +/* Handler for iterator to compute the loop variable.  ADD_OP indicates
> > +   whether we need a '+' or '-' operation. LOW indicates the starting point
> > +   and LOOP_VAR is the induction variable.  Returns an expression (or a
> > +   STATEMENT_LIST of expressions).  If it is unable to find the appropriate
> > +   iteration, then it returns an error mark node and its parent will set
> > +   the loop as invalid.  */
> 
> This doesn't explain what VAR2 is.  And it seems like you're also using LOW as
> the increment?
> 

var2 is a copy of the _Cilk_for induction variable; its context is the _Cilk_for nested function rather than the enclosing function.
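
A minimal sketch of that relationship, assuming a unit stride and an integer
induction variable. `cilk_for_helper_sketch` and `seen` are illustrative
names only; the parameter names mirror the patch's min_parm, max_parm, low
and var2.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the outlined _Cilk_for helper.  It receives a
// half-open range [min_parm, max_parm) of iteration counts and rebuilds a
// private copy of the user's induction variable (var2) from the loop's
// starting value (low) before running the body.
void cilk_for_helper_sketch (int low, int min_parm, int max_parm,
                             std::vector<int> &seen)
{
  for (int loop_var = min_parm; loop_var < max_parm; loop_var++)
    {
      int var2 = low + loop_var;  // local to the helper, not the outer variable
      seen.push_back (var2);      // stands in for the _Cilk_for body
    }
}
```

For a loop that starts at 5, a chunk covering iteration counts 0 to 2 sees
var2 take the values 5, 6 and 7.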

> > +      tree new_stmt = build_x_modify_expr (loc, new_var, INIT_EXPR,
> > +					   build_zero_cst (TREE_TYPE (new_var)),
> > +					   tf_warning_or_error);
> > +      if (new_stmt == error_mark_node)
> > +	return error_mark_node;
> > +      append_to_statement_list (new_stmt, &exp);
> > +      new_stmt = build_x_modify_expr (loc, new_var, NOP_EXPR, low,
> > +				      tf_warning_or_error);
> 
> Why assign 0 if you're going to immediately assign low afterwards?
> 

This part is fixed as I mentioned above.

> > +  /* We have to manually create this loop for two reasons:
> > +     a. We need to have access to continue and start label since we need
> > +        to resolve continue and breaks by hand.
> 
> Why do you need to resolve them by hand?  It looks like break isn't even
> allowed.
> 

You are correct, I don't need to do them. I just need to emit a FOR_STMT with the body inside it and then when I do a cp_genericize, it will automatically resolve it. I fixed it accordingly.
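
Roughly, the effect is the same as in the sketch below (an ordinary C++ loop
with a hypothetical function name, not code from the patch): once the body
sits inside a normal for loop, `continue` jumps to that loop's increment and
no hand-written labels are needed.

```cpp
#include <cassert>

// Sketch only: counts how many iterations reach the end of the body when
// even values of ii bail out early with `continue`.
int count_odd_iterations (int start, int end)
{
  int visited = 0;
  for (int ii = start; ii < end; ii++)
    {
      if (ii % 2 == 0)
        continue;   // resolved by the enclosing for loop itself
      visited++;
    }
  return visited;
}
```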

> > +     b. C++ doesn't provide a c_finish_loop function like C does.  */
> 
> Why is that important?
> 

Please see my note above. By the way, I hope you didn't read that comment as knocking the C++ implementation; I was only explaining why I created the loop by hand. I have removed it now since it is no longer applicable.

> >    sk_for,	     /* The scope of the variable declared in a
> >  			for-init-statement.  */
> > +  sk_cilk_for,       /* The scope of the variable declared in _Cilk_for init
> > +			statement.  */
> 
> How is this different from a normal for-init-statement?  Nothing seems to
> use it.
> 

Yep. Removed.

> Jason


[-- Attachment #2: diff.txt --]
[-- Type: text/plain, Size: 58231 bytes --]

diff --git a/gcc/cp/cp-cilkplus.c b/gcc/cp/cp-cilkplus.c
index 414f71e..1c5cfb2
--- a/gcc/cp/cp-cilkplus.c
+++ b/gcc/cp/cp-cilkplus.c
@@ -28,7 +28,10 @@
 #include "tree-iterator.h"
 #include "tree-inline.h"  /* for copy_tree_body_r.  */
 #include "ggc.h"
+#include "cgraph.h"
+#include "gimple.h"
 #include "cilk.h"
+#include "gimplify.h"
 
 /* Callback for cp_walk_tree to validate the body of a pragma simd loop
    or _cilk_for loop.
@@ -180,3 +183,431 @@ cp_cilk_install_body_wframe_cleanup (tree fndecl, tree orig_body)
 			    &list);
 }
 
+/* Returns an overloaded function that does operation based on CODE using
+   OP0 and OP1.  If CRY is set to true, then the function complains when
+   it is unable to find an overloaded operator.  */
+
+static tree
+callable (location_t loc, enum tree_code code, tree op0, tree op1, bool cry)
+{
+  vec<tree, va_gc> *op1_vec = make_tree_vector_single (op1);
+  if (code == INIT_EXPR)
+    return build_special_member_call (NULL_TREE, complete_ctor_identifier,
+				      &op1_vec,
+				      TYPE_MAIN_VARIANT (TREE_TYPE (op1)), 0,
+				      cry);
+    
+  if (code == PSEUDO_DTOR_EXPR)
+    return build_special_member_call (NULL_TREE, complete_dtor_identifier,
+				      &op1_vec,
+				      TYPE_MAIN_VARIANT (TREE_TYPE (op1)), 0,
+				      cry);
+
+  int flags = LOOKUP_NORMAL;
+  tree exp = build_new_op (EXPR_LOCATION (op1), code, flags, op0, op1,
+			   NULL_TREE, NULL, tf_none);
+  if (exp == error_mark_node)
+    exp = build_x_modify_expr (EXPR_LOCATION (op1), op0, code, op1, tf_none);
+  if (exp && exp != error_mark_node)
+    return exp;
+
+  const char *op = operator_name_info[(int) code].name;
+  const char *explain = cry ? "" : " accessible, unambiguous";
+  if (op1) 
+    error_at (loc, "No%s operator%s(%T,%T) for _Cilk_for loop", explain, op, 
+	      TREE_TYPE (op0), TREE_TYPE (op1)); 
+  else 
+    error_at (loc, "No%s operator%s(%T,%T) for _Cilk_for loop", explain, op, 
+	      TREE_TYPE (op0), TREE_TYPE (op0));
+  return NULL_TREE;
+}
+
+/* Calculates the COUNT_UP and/or COUNT_DOWN values for a _Cilk_for loop using
+   its characteristics stored in *CFD.  */
+
+static void
+calc_count_up_count_down (struct cilk_for_desc *cfd, tree *count_up,
+			  tree *count_down)
+{
+  /* Reasoning for high and low variables can be found in
+     cilk_compute_loop_count in c-family/cilk.c.  */
+  tree high = cfd->end_var ? cfd->end_var : cfd->end_expr;
+  tree low = cfd->lower_bound ? cfd->lower_bound : cfd->var;
+
+  /* When these are invalid, we flag them in cilk_compute_loop_var.  This
+     condition is a bit rare.  */
+  if (high == error_mark_node || low == error_mark_node)
+    return;
+  
+  /* Only call this function if we are using an iterator.  */
+  gcc_assert (cfd->iterator);
+  
+  if (TREE_CODE (high) == TARGET_EXPR)
+    high = TARGET_EXPR_INITIAL (high);
+  if (TREE_CODE (low) == TARGET_EXPR)
+    low = TARGET_EXPR_INITIAL (low);
+  
+  if (TREE_CODE (low) == TREE_LIST)
+    low = TREE_VALUE (low);
+  high = cilk_tree_operand_noconv (high);
+  if (cfd->direction >= 0)
+    {
+      *count_up = build_x_binary_op (cfd->loc, MINUS_EXPR, high,
+				     TREE_CODE (high), low, TREE_CODE (low),
+				     NULL, tf_warning_or_error);
+      /* We should have already failed if this operator is not callable.  */
+      gcc_assert (*count_up != error_mark_node);
+    }
+  else
+    {
+      *count_down = build_x_binary_op (cfd->loc, MINUS_EXPR, low,
+				       TREE_CODE (low), high, TREE_CODE (high),
+				       NULL, tf_warning_or_error);
+      /* ...same reasoning as count up for the assert below.  */
+      gcc_assert (*count_down != error_mark_node);
+    }
+}
+
+/* Handler for iterator to compute the loop variable.  ADD_OP indicates
+   whether we need a '+' or '-' operation. LOW indicates the starting point
+   and LOOP_VAR is the induction variable.  This function returns an
+   INIT_EXPR.  */
+
+static tree
+compute_loop_var_cp_iter_hdl (location_t loc, enum tree_code add_op,
+			      tree low, tree loop_var, tree var2)
+{
+  tree exp = build_new_op (loc, add_op, 0, low, loop_var, NULL_TREE, 0,
+			   tf_none);
+  gcc_assert (exp != error_mark_node);
+  exp = cp_build_modify_expr (var2, INIT_EXPR, exp, tf_warning_or_error);
+  return exp;
+}
+
+/* Returns the body of the nested function for a _Cilk_for using the loop's
+   characteristic information from CFD.  The returned tree will be a
+   STATEMENT LIST.  */
+
+static tree
+cp_create_cilk_for_body (struct cilk_for_desc *cfd)
+{
+  push_function_context ();
+  declare_cilk_for_parms (cfd);
+  cfd->wd.fntype = build_function_type (void_type_node, cfd->wd.argtypes);
+
+  tree fndecl = cilk_create_cilk_helper_decl (&cfd->wd);
+  fndecl = build_lang_decl (FUNCTION_DECL, DECL_NAME (fndecl), cfd->wd.fntype);
+  if (cfd->nested_ok)
+    DECL_CONTEXT (fndecl) = current_function_decl;
+  else
+    DECL_CONTEXT (fndecl) = DECL_CONTEXT (current_function_decl);
+
+  tree outer = current_function_decl;
+  SET_DECL_LANGUAGE (fndecl, lang_c);
+  start_preparsed_function (fndecl, NULL_TREE, SF_PRE_PARSED);
+
+  declare_cilk_for_vars (cfd, fndecl);
+  
+  tree lower_bound = cfd->lower_bound;
+  struct gimplify_ctx gctx;
+
+  tree body = begin_compound_stmt (BCS_FN_BODY);
+  push_gimplify_context (&gctx);
+
+  gimple_add_tmp_var (cfd->var2);
+
+  /* Get the lower bound into a variable unless it is a constant or a
+     non-copyable value.  If non-copyable value, then reference value from
+     the outer frame.  */
+  if (!lower_bound)
+    {
+      lower_bound = cfd->var;
+      tree hack = build_decl (cfd->loc, VAR_DECL, NULL_TREE,
+			      TREE_TYPE (lower_bound));
+      DECL_CONTEXT (hack) = DECL_CONTEXT (lower_bound);
+      *pointer_map_insert (cfd->wd.decl_map, hack) = lower_bound;
+      lower_bound = hack;
+    }
+  tree cast_max_expr, count_type, pre, loop_var;
+  if (INTEGRAL_TYPE_P (cfd->var_type))
+    {
+      loop_var = create_tmp_var (cfd->var_type, NULL);
+      count_type = cfd->var_type;
+      tree cvt_expr = cp_fold_convert (cfd->var_type, cfd->min_parm);
+      pre = build_x_modify_expr (cfd->loc, loop_var, NOP_EXPR, cvt_expr,
+				 tf_warning_or_error);
+      cast_max_expr = cp_fold_convert (count_type, cfd->max_parm);
+    }
+  else
+    {
+      loop_var = create_tmp_var (TREE_TYPE (cfd->min_parm), NULL);
+      count_type = cfd->count_type;
+      pre = fold_build2 (INIT_EXPR, void_type_node, loop_var, cfd->min_parm);
+      cast_max_expr = cfd->max_parm;
+    }
+
+  tree loop_body = alloc_stmt_list ();
+  
+  /* Concat. the control variable initialization with the loop body.
+     Do not call gimplify_and_add to append to list because we need
+     to wrap the entire list in a cleanup point expr to delay destruction
+     of the control variable to the end of the loop if it is an iterator.  */
+  tree loop_end_comp = cilk_compute_loop_var (cfd, loop_var, lower_bound,
+					      compute_loop_var_cp_iter_hdl);
+  if (loop_end_comp == error_mark_node)
+    {
+      cfd->invalid = true;
+      return error_mark_node;
+    }
+  append_to_statement_list (loop_end_comp, &loop_body);
+  tree cleanup = cxx_maybe_build_cleanup (cfd->var2, tf_none);
+  if (cleanup)
+    {
+      append_to_statement_list (cfd->body, &loop_body);
+      append_to_statement_list (cleanup, &loop_body);
+    }
+  else
+    append_to_statement_list (cfd->body, &loop_body);
+
+  loop_body = fold_build_cleanup_point_expr (void_type_node, loop_body);
+  DECL_SEEN_IN_BIND_EXPR_P (cfd->var2) = 1;
+
+  cfd->wd.context = outer;
+  bool throws = flag_exceptions ? cp_function_chain->can_throw : false;
+  cilk_outline_body (fndecl, &loop_body, &cfd->wd, &throws);
+  cp_function_chain->can_throw = throws;
+
+  tree loop_cond = fold_build2 (LT_EXPR, boolean_type_node, loop_var,
+				cast_max_expr);
+  tree mod_expr = fold_build2 (MODIFY_EXPR, void_type_node, loop_var,
+				build2 (PLUS_EXPR, count_type, loop_var,
+					build_one_cst (count_type)));
+
+  /* This for loop will look like this (assuming start < end):
+     for (ii = start; ii < end; ii++)
+       <_Cilk_for body>  */
+  add_stmt (build5 (FOR_STMT, void_type_node, pre, loop_cond, mod_expr,
+		    loop_body, NULL_TREE));
+
+  DECL_INITIAL (fndecl) = make_node (BLOCK);
+  TREE_USED (DECL_INITIAL (fndecl)) = 1;
+  BLOCK_VARS (DECL_INITIAL (fndecl)) = loop_var;
+  TREE_CHAIN (loop_var) = cfd->var2;
+
+  body = build3 (BIND_EXPR, void_type_node, loop_var, body,
+		 DECL_INITIAL (fndecl));
+  DECL_CONTEXT (cfd->var2) = fndecl;
+  pop_gimplify_context (0);
+
+  finish_function_body (body);
+  
+  /* A nested function cannot be expanded or deferred until its parent is done.
+     So, don't call expand_or_defer_fn here.  A non-nested function must be
+     done here.  */
+  if (!cfd->nested_ok)
+    expand_or_defer_fn (fndecl);
+  
+  pop_function_context ();
+  return fndecl;
+}
+
+/* Creates a nested function for the _Cilk_for statement using its information
+   in CFD.  PRE_P is the preceding gimple sequence.  */
+
+static tree
+create_cilk_for_nested_fn (struct cilk_for_desc *cfd, gimple_seq *pre_p)
+{
+  tree var = cfd->var;
+  DECL_CONTEXT (var) = current_function_decl;
+
+  if (POINTER_TYPE_P (TREE_TYPE (var)))
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_WRITE);
+  else
+    cilk_extract_free_variables (cfd->lower_bound, &cfd->wd, ADD_READ);
+
+  tree incr = cfd->incr;
+
+  /* If the loop increment is not an integer constant and is not a DECL,
+     copy it to a temporary.  If it is modified during the loop the behavior
+     is undefined.  Races could be avoided by copying it to a temporary
+     variable.  */
+  if (TREE_CODE (incr) != INTEGER_CST && !DECL_P (incr))
+    {
+      incr = get_formal_tmp_var (incr, pre_p);
+      cfd->incr = incr;
+    }
+
+  if (DECL_P (incr) && !TREE_STATIC (incr) && !DECL_EXTERNAL (incr))
+    *pointer_map_insert (cfd->wd.decl_map, incr) = incr;
+
+  /* Map the loop variable to integer_minus_one_node if we won't really be
+     passing it into the loop body.  Otherwise map to integer_zero_node.  */
+  *pointer_map_insert (cfd->wd.decl_map, var) =
+    (void *) (cfd->lower_bound ? integer_minus_one_node : integer_zero_node);
+  cilk_extract_free_variables (cfd->body, &cfd->wd, ADD_READ);
+
+  tree fn = cp_create_cilk_for_body (cfd);
+
+  /* One of the reasons why FN is error_mark_node is because the function
+     couldn't find the appropriate overloaded operation.  */
+  if (fn == error_mark_node)
+    return error_mark_node;
+
+  DECL_UNINLINABLE (fn) = 1;
+  DECL_STATIC_CHAIN (fn) = 1;
+
+  current_function_decl = fn;
+  /* Genericize the _Cilk_for body, mainly split up the _Cilk_for body and
+     the for-loop we inserted.  */
+  cp_genericize (fn);
+  return fn;
+}
+
+/* Helper function to gimplify a CILK_FOR_STMT.  CFD holds all the values
+   extracted from a CILK_FOR_STMT and *PRE_P is the preceding sequence.  */
+
+static void
+gimplify_cilk_for_1 (struct cilk_for_desc cfd, gimple_seq *pre_p)
+{
+  bool order_variable = false;
+  tree parent_function = current_function_decl;
+  
+  if (TREE_SIDE_EFFECTS (cfd.end_expr))
+    {
+      enum tree_code ecode = TREE_CODE (cfd.end_expr);
+      if (ecode == INIT_EXPR || ecode == MODIFY_EXPR)
+	cfd.end_var = TREE_OPERAND (cfd.end_expr, 0);
+      else if (ecode == TARGET_EXPR)
+	{
+	  cfd.end_var = TARGET_EXPR_INITIAL (cfd.end_expr);
+	  if (TREE_CODE (cfd.end_var) == AGGR_INIT_EXPR)
+	    cfd.end_var = TARGET_EXPR_SLOT (cfd.end_expr);
+	  else
+	    cfd.end_var = get_formal_tmp_var (cfd.end_var, pre_p);
+	}
+      else if (ecode == CALL_EXPR)
+	cfd.end_var = cfd.end_expr;
+      else
+	{
+	  tree ii_tree = cfd.end_expr;
+	  while (TREE_CODE_CLASS (TREE_CODE (ii_tree)) == tcc_unary)
+	    ii_tree = TREE_OPERAND (ii_tree, 0);
+	  if (TREE_CODE (ii_tree) == ADDR_EXPR)
+	    ii_tree = TREE_OPERAND (ii_tree, 0);
+	  ecode = TREE_CODE (ii_tree);
+	  tree tmp_var = cilk_tree_operand_noconv (cfd.end_expr);
+	  cfd.end_var = get_formal_tmp_var (tmp_var, pre_p);
+	  order_variable = true;
+	}
+    }
+  tree cond = cfd.cond;
+  tree op1 = TREE_OPERAND (cond, 1);
+  tree op0 = TREE_OPERAND (cond, 0);
+  enum tree_code cond_code = TREE_CODE (cond);
+
+  /* In this case below, we have an overloaded boolean comparison operation.  */
+  if (cond_code == CALL_EXPR)
+    {
+      cond_code = cilk_find_code_from_call (CALL_EXPR_FN (cond));
+      op1 = cilk_tree_operand_noconv (CALL_EXPR_ARG (cond, 1));
+      op0 = cilk_tree_operand_noconv (CALL_EXPR_ARG (cond, 0));
+      if (TREE_CODE (op0) == ADDR_EXPR || TREE_CODE (op0) == INDIRECT_REF)
+	op0 = TREE_OPERAND (op0, 0);
+    }
+  if (order_variable && op1 == cfd.end_expr)
+    op1 = cfd.end_var;
+  else if (order_variable && op0 == cfd.end_expr)
+    op0 = cfd.end_var;
+  
+  cond = callable (cfd.loc, cond_code, op0, op1, false);
+  gcc_assert (cond != NULL_TREE);
+
+  if (TREE_CODE (TREE_TYPE (cond)) != BOOLEAN_TYPE)
+    cond = perform_implicit_conversion (boolean_type_node, cond,
+					tf_warning_or_error);
+  enum tree_code div_op = NOP_EXPR;
+  tree forward = NULL_TREE, count_up = NULL_TREE, count_down = NULL_TREE;
+  cilk_calc_forward_div_op (&cfd, &div_op, &forward);
+  if (cfd.iterator)
+    calc_count_up_count_down (&cfd, &count_up, &count_down);
+  
+  tree count = cilk_compute_loop_count (&cfd, div_op, forward, count_up,
+					count_down);
+  tree fn = create_cilk_for_nested_fn (&cfd, pre_p);
+  if (fn == error_mark_node)
+    return;
+  cfd.cond = cond;
+  
+  current_function_decl = parent_function;
+  gimple_seq inner_seq = insert_cilk_for_nested_fn (&cfd, count, fn);
+  gimple_seq_add_seq (pre_p, inner_seq);
+}
+
+/* Extract all the relevant information from CFOR, a CILK_FOR_STMT tree
+   and store them in CFD structure.  */
+
+static void
+cp_extract_cilk_for_fields (struct cilk_for_desc *cfd, tree cfor)
+{
+  cfd->var = CILK_FOR_VAR (cfor);
+  cfd->cond = CILK_FOR_COND (cfor);
+  cfd->lower_bound = CILK_FOR_INIT (cfor);
+  cfd->incr = CILK_FOR_EXPR (cfor);
+  cfd->loc = EXPR_LOCATION (cfor);
+  cfd->body = CILK_FOR_BODY (cfor);
+  cfd->grain = CILK_FOR_GRAIN (cfor);
+  cfd->invalid = false;
+
+  /* This function shouldn't be setting these two variables.  */
+  cfd->ctx_arg = NULL_TREE;
+  cfd->count = NULL_TREE;
+  
+  cilk_set_init_info (cfd);
+  cilk_set_inclusive_and_direction (cfd);
+  cilk_set_iter_difftype (cfd);
+
+  if (cfd->iterator)
+    {
+      tree exp = NULL_TREE;
+      tree hack = build_decl (cfd->loc, VAR_DECL, NULL_TREE,
+			      TREE_TYPE (cfd->var));
+      if (cfd->direction >= 0)
+	exp = callable (cfd->loc, MINUS_EXPR, hack, cfd->var,true);
+      else
+	exp = callable (cfd->loc, MINUS_EXPR, cfd->var, hack, true);
+      if (!exp) 
+	{ 
+	  cfd->invalid = true;
+	  return;
+	}
+      cfd->difference_type = TYPE_MAIN_VARIANT (TREE_TYPE (exp));
+    }
+  cfd->count_type = cilk_check_loop_difference_type (cfd->difference_type);
+  cilk_set_incr_info (cfd, true);
+}
+
+/* Entry function to gimplify a CILK_FOR_STMT, *FOR_P.  *PRE_P and *POST_P are
+    preceding and succeeding gimple sequences of *FOR_P, respectively.  */
+
+int
+cp_gimplify_cilk_for (tree *for_p, gimple_seq *pre_p,
+		      gimple_seq *post_p ATTRIBUTE_UNUSED)
+{
+  struct cilk_for_desc cfd;
+
+  cfun->is_cilk_function = 1;
+  cilk_init_cfd (&cfd);
+
+  cp_extract_cilk_for_fields (&cfd, *for_p);
+  if (cfd.invalid)
+    {
+      *for_p = build_empty_stmt (cfd.loc);
+      return GS_ERROR;
+    }
+  cfd.nested_ok = !DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (current_function_decl);
+  gimplify_cilk_for_1 (cfd, pre_p);
+  *for_p = NULL_TREE;
+
+  return GS_ALL_DONE;
+}
+
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index c464719..b40e9a6 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -269,6 +269,23 @@ genericize_cp_loop (tree *stmt_p, location_t start_locus, tree cond, tree body,
   *stmt_p = stmt_list;
 }
 
+/* Genericize a CILK_FOR_STMT node *STMT_P.  */
+
+static void
+genericize_cilk_for_stmt (tree *stmt_p, int *walk_subtrees, void *data)
+{
+  tree stmt = *stmt_p;
+  cp_walk_tree (&CILK_FOR_COND (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_INIT (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_GRAIN (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_VAR (stmt), cp_genericize_r, data, NULL);
+  cp_walk_tree (&CILK_FOR_EXPR (stmt), cp_genericize_r, data, NULL);
+
+  /* _Cilk_for body will be resolved after it is inserted into a nested
+     function.  */
+  *walk_subtrees = 0;
+} 
+
 /* Genericize a FOR_STMT node *STMT_P.  */
 
 static void
@@ -1121,6 +1138,8 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void *data)
     gcc_assert (!CONVERT_EXPR_VBASE_PATH (stmt));
   else if (TREE_CODE (stmt) == FOR_STMT)
     genericize_for_stmt (stmt_p, walk_subtrees, data);
+  else if (TREE_CODE (stmt) == CILK_FOR_STMT)
+    genericize_cilk_for_stmt (stmt_p, walk_subtrees, data);
   else if (TREE_CODE (stmt) == WHILE_STMT)
     genericize_while_stmt (stmt_p, walk_subtrees, data);
   else if (TREE_CODE (stmt) == DO_STMT)
diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index eb81cb2..4d1c45e 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -167,4 +167,7 @@ extern void cp_common_init_ts (void);
 #undef  LANG_HOOKS_CILKPLUS_FRAME_CLEANUP
 #define LANG_HOOKS_CILKPLUS_FRAME_CLEANUP cp_cilk_install_body_wframe_cleanup
 
+#undef  LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR
+#define LANG_HOOKS_CILKPLUS_GIMPLIFY_CILK_FOR cp_gimplify_cilk_for
+
 #endif /* GCC_CP_OBJCP_COMMON */
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 77daeb8..605f9b0 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5687,6 +5687,10 @@ extern void finish_for_init_stmt		(tree);
 extern void finish_for_cond			(tree, tree, bool);
 extern void finish_for_expr			(tree, tree);
 extern void finish_for_stmt			(tree);
+extern tree begin_cilk_for_stmt                 (tree, tree);
+extern void finish_cilk_for_init_stmt           (tree);
+extern tree finish_cilk_for_stmt                (tree);
+extern tree finish_cilk_for_cond                (tree);
 extern tree begin_range_for_stmt		(tree, tree);
 extern void finish_range_for_decl		(tree, tree, tree);
 extern void finish_range_for_stmt		(tree);
@@ -6182,6 +6186,8 @@ extern void vtv_build_vtable_verify_fndecl      (void);
 extern bool cpp_validate_cilk_plus_loop		(tree);
 extern void cp_cilk_install_body_wframe_cleanup (tree, tree);
 extern tree cp_cilk_copy_tree_body_r            (tree *, int *, void *);
+extern int cp_gimplify_cilk_for                 (tree *, gimple_seq *,
+						 gimple_seq *);
 
 /* In cp/cp-array-notations.c */
 extern tree expand_array_notation_exprs         (tree);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 0f9b29b..aac3ca5 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -235,6 +235,10 @@ static tree cp_literal_operator_id
 
 static void cp_parser_cilk_simd
   (cp_parser *, cp_token *);
+static tree cp_parser_cilk_for
+  (cp_parser *, tree);
+static void cp_parser_cilk_grainsize
+  (cp_parser *, cp_token *);
 static bool cp_parser_omp_declare_reduction_exprs
   (tree, cp_parser *);
 
@@ -2060,7 +2064,8 @@ static tree cp_parser_decltype
 /* Declarators [gram.dcl.decl] */
 
 static tree cp_parser_init_declarator
-  (cp_parser *, cp_decl_specifier_seq *, vec<deferred_access_check, va_gc> *, bool, bool, int, bool *, tree *);
+  (cp_parser *, cp_decl_specifier_seq *, vec<deferred_access_check, va_gc> *,
+   bool, bool, int, bool *, tree *, tree *);
 static cp_declarator *cp_parser_declarator
   (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool);
 static cp_declarator *cp_parser_direct_declarator
@@ -9353,6 +9358,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 
 	case RID_WHILE:
 	case RID_DO:
+	case RID_CILK_FOR:
 	case RID_FOR:
 	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
@@ -10508,6 +10514,17 @@ cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
       }
       break;
 
+    case RID_CILK_FOR:
+      if (!flag_enable_cilkplus)
+	{ 
+	  error_at (token->location, 
+		    "-fcilkplus must be enabled to use %<_Cilk_for%>");
+	  statement = error_mark_node;
+	}
+      else
+	statement = cp_parser_cilk_for (parser, NULL_TREE);
+      break;
+
     default:
       cp_parser_error (parser, "expected iteration-statement");
       statement = error_mark_node;
@@ -10627,9 +10644,15 @@ cp_parser_jump_statement (cp_parser* parser)
 	case IN_OMP_FOR:
 	  error_at (token->location, "break statement used with OpenMP for loop");
 	  break;
+
 	case IN_CILK_SIMD_FOR:
 	  error_at (token->location, "break statement used with Cilk Plus for loop");
 	  break;
+
+	case IN_CILK_FOR_STMT:
+	  error_at (token->location,
+		    "break statement used in _Cilk_for loop body");
+	  break;
 	}
       cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
       break;
@@ -10645,6 +10668,7 @@ cp_parser_jump_statement (cp_parser* parser)
 		    "continue statement within %<#pragma simd%> loop body");
 	  /* Fall through.  */
 	case IN_ITERATION_STMT:
+	case IN_CILK_FOR_STMT:
 	case IN_OMP_FOR:
 	  statement = finish_continue_stmt ();
 	  break;
@@ -11191,7 +11215,7 @@ cp_parser_simple_declaration (cp_parser* parser,
 					/*member_p=*/false,
 					declares_class_or_enum,
 					&function_definition_p,
-					maybe_range_for_decl);
+					maybe_range_for_decl, NULL);
       /* If an error occurred while parsing tentatively, exit quickly.
 	 (That usually happens when in the body of a function; each
 	 statement is treated as a declaration-statement until proven
@@ -16442,7 +16466,8 @@ cp_parser_init_declarator (cp_parser* parser,
 			   bool member_p,
 			   int declares_class_or_enum,
 			   bool* function_definition_p,
-			   tree* maybe_range_for_decl)
+			   tree* maybe_range_for_decl,
+			   tree* init)
 {
   cp_token *token = NULL, *asm_spec_start_token = NULL,
            *attributes_start_token = NULL;
@@ -16450,7 +16475,9 @@ cp_parser_init_declarator (cp_parser* parser,
   tree prefix_attributes;
   tree attributes = NULL;
   tree asm_specification;
-  tree initializer;
+  /* Initialize initializer to remove a "using potentially unset variable"
+     warning/error.  */
+  tree initializer = NULL_TREE;
   tree decl = NULL_TREE;
   tree scope;
   int is_initialized;
@@ -16587,7 +16614,8 @@ cp_parser_init_declarator (cp_parser* parser,
 	      DECL_STRUCT_FUNCTION (decl)->function_start_locus
 		= func_brace_location;
 	    }
-
+	  if (init)
+	    *init = initializer;
 	  return decl;
 	}
     }
@@ -16822,6 +16850,8 @@ cp_parser_init_declarator (cp_parser* parser,
 	finish_fully_implicit_template (parser, /*member_decl_opt=*/0);
     }
 
+  if (init)
+    *init = initializer;
   return decl;
 }
 
@@ -22987,6 +23017,7 @@ cp_parser_single_declaration (cp_parser* parser,
 				        member_p,
 				        declares_class_or_enum,
 				        &function_definition_p,
+					NULL,
 					NULL);
 
     /* 7.1.1-1 [dcl.stc]
@@ -31256,6 +31287,21 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
       cp_parser_cilk_simd (parser, pragma_tok);
       return true;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (context == pragma_external)
+	{
+	  error_at (pragma_tok->location,
+		    "%<#pragma cilk grainsize%> may only be used inside a "
+		    "function");
+	  break;
+	}
+
+      /* Ignore the pragma if Cilk Plus is not enabled.  */
+      if (flag_enable_cilkplus)
+	{
+	  cp_parser_cilk_grainsize (parser, pragma_tok);
+	  return true;
+	}
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -31572,6 +31618,213 @@ cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token)
   return;
 }
 
+static tree
+cp_parser_cilk_for_init_statement (cp_parser *parser, tree *init)
+{
+  cp_token *token = cp_lexer_peek_token (parser->lexer);
+  location_t loc = token->location;
+  tree decl_init = NULL_TREE;
+  if (token->type == CPP_SEMICOLON)
+    {
+      error_at (loc, "expected induction variable");
+      return error_mark_node;
+    }
+
+  if (cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_REGISTER)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_EXTERN)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_MUTABLE)
+      || cp_lexer_next_token_is_keyword (parser->lexer, RID_THREAD))
+    {
+      error_at (loc, "storage class is not allowed");
+      cp_lexer_consume_token (parser->lexer);
+    }
+
+  if (token->type == CPP_NAME)
+    {
+      tree type = cp_parser_lookup_name_simple (parser, token->u.value, loc);
+      if (TREE_CODE (type) == VAR_DECL || TREE_CODE (type) == PARM_DECL)
+	{
+	  error_at (loc, "_Cilk_for loop initializer must declare variable");
+	  cp_parser_skip_to_end_of_statement (parser);
+	  return error_mark_node;
+	}
+    }
+  int flags = 0;
+  cp_decl_specifier_seq specs;
+  cp_parser_decl_specifier_seq (parser, CP_PARSER_FLAGS_NONE, &specs, &flags);
+  tree decl = cp_parser_init_declarator (parser, &specs, NULL, false, false,
+					 flags, NULL, NULL, &decl_init);
+  /* If the initializer is a constant, it is sometimes not saved in
+     DECL_INITIAL, so keep the value returned through DECL_INIT.  When
+     DECL_INITIAL is set, prefer it, since it already carries all the
+     necessary type conversions.  */
+  if (decl != error_mark_node && DECL_INITIAL (decl))
+    decl_init = DECL_INITIAL (decl);
+
+  
+  if (processing_template_decl)
+    add_stmt (decl_init);
+  else
+    *init = decl_init;
+  parser->scope = NULL_TREE;
+  parser->qualifying_scope = NULL_TREE;
+  parser->object_scope = NULL_TREE;
+
+  if (decl == error_mark_node || DECL_INITIAL (decl) == error_mark_node
+      || TREE_TYPE (decl) == error_mark_node)
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      gcc_assert (errorcount || sorrycount);
+      return error_mark_node;
+    }
+  return decl;
+}
+
+static void
+cp_parser_cilk_grainsize (cp_parser *parser, cp_token *pragma_tok)
+{
+  if (cp_parser_require (parser, CPP_EQ, RT_EQ))
+    {
+      tree exp = cp_parser_binary_expression (parser, false, false,
+					      PREC_NOT_OPERATOR, NULL);
+      cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+      if (!exp || exp == error_mark_node)
+	{
+	  error_at (pragma_tok->location, "invalid grainsize for _Cilk_for");
+	  return;
+	}
+      cp_token *n_tok = cp_lexer_peek_token (parser->lexer);
+
+      /* Make sure the next token is _Cilk_for, it is invalid otherwise.  */
+      if (n_tok && n_tok->type == CPP_KEYWORD && n_tok->keyword == RID_CILK_FOR)
+	{
+	  cp_lexer_consume_token (parser->lexer);
+	  tree cfor = cp_parser_cilk_for (parser, exp);
+	  if (cfor && STATEMENT_CODE_P (TREE_CODE (cfor)))
+	    SET_EXPR_LOCATION (cfor, n_tok->location);
+	}
+      else
+	warning (0, "%<#pragma cilk grainsize%> is not followed by "
+		 "%<_Cilk_for%>");
+      return;
+    }
+  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+}
+
+/* Top-level function to parse _Cilk_for and the for statement
+   following <#pragma simd>.  */
+
+static tree
+cp_parser_cilk_for (cp_parser *parser, tree grain)
+{
+  bool valid = true;
+  tree cond = NULL_TREE;
+  tree incr_expr = NULL_TREE;
+  tree init = NULL_TREE;
+  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+
+  tree scope = begin_for_scope (&init); 
+  tree statement = begin_cilk_for_stmt (scope, init);
+
+  if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      return error_mark_node;
+    }
+
+  /* Parse initialization.  */ 
+  tree decl = cp_parser_cilk_for_init_statement (parser, &init);
+    
+  if (decl == error_mark_node)
+    valid = false;
+  else if (!decl || (TREE_CODE (decl) != VAR_DECL
+		     && TREE_CODE (decl) != DECL_EXPR))
+    {
+      error_at (loc, "_Cilk_for loop initializer does not declare a variable");
+      valid = false;
+      decl = error_mark_node;
+    }
+  if (cp_lexer_next_token_is (parser->lexer, CPP_COMMA))
+    {
+      error_at (loc, "_Cilk_for loop initializer cannot have multiple variable"
+		" declarations");
+      cp_parser_skip_to_end_of_statement (parser);
+      valid = false;
+    }
+
+  if (!valid)
+    /* Skip to the semicolon ending the init.  */
+    cp_parser_skip_to_end_of_statement (parser);
+  else
+    {
+      CILK_FOR_INIT (statement) = init;
+      CILK_FOR_VAR (statement) = decl;
+      finish_cilk_for_init_stmt (statement);
+    }
+
+  /* Parse condition.  */
+  if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON))
+    return error_mark_node;
+  if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
+    {
+      error_at (loc, "missing condition");
+      cond = error_mark_node;
+    }
+  else
+    { 
+      cond = cp_parser_condition (parser);
+      cond = finish_cilk_for_cond (cond); 
+      CILK_FOR_COND (statement) = cond;
+    }
+
+  if (cond == error_mark_node)
+    valid = false;
+  cp_parser_consume_semicolon_at_end_of_statement (parser);
+
+  /* Parse increment.  */
+  if (cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_PAREN))
+    {
+      error_at (loc, "missing increment");
+      incr_expr = error_mark_node;
+    }
+  else
+    incr_expr = cp_parser_expression (parser, false, NULL);
+  if (TREE_CODE (incr_expr) == ERROR_MARK)
+    {
+      cp_parser_skip_to_closing_parenthesis (parser, true, false, false);
+      valid = false;
+    }
+  if (!cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
+    {
+      cp_parser_skip_to_end_of_statement (parser);
+      valid = false;
+    }
+  
+  if (!valid)
+    {
+      gcc_assert (sorrycount || errorcount);
+      return error_mark_node;
+    }
+  
+  finish_for_expr (incr_expr, statement);
+  CILK_FOR_EXPR (statement) = incr_expr;
+  int saved_in_statement = parser->in_statement;
+  parser->in_statement = IN_CILK_FOR_STMT;
+  cp_parser_already_scoped_statement (parser);
+  parser->in_statement = saved_in_statement;
+  
+  CILK_FOR_GRAIN (statement) = grain;
+  statement = finish_cilk_for_stmt (statement);
+
+  /* Check if the body satisfies all the requirements of _Cilk_for.
+     If invalid, then just return error_mark_node.  */
+  if (statement == error_mark_node
+      || !cpp_validate_cilk_plus_loop (CILK_FOR_BODY (statement)))
+    return error_mark_node;
+  return statement;
+}
+
 /* Create an identifier for a generic parameter type (a synthesized
    template parameter implied by `auto' or a concept identifier). */
 
diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
index e26e350..8d1ce44 100644
--- a/gcc/cp/parser.h
+++ b/gcc/cp/parser.h
@@ -302,6 +302,8 @@ typedef struct GTY(()) cp_parser {
 #define IN_IF_STMT             16
 #define IN_CILK_SIMD_FOR       32
 #define IN_CILK_SPAWN          64
+#define IN_CILK_FOR_STMT       128
+  
   unsigned char in_statement;
 
   /* TRUE if we are presently parsing the body of a switch statement.
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 1b34434..302163f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13335,6 +13335,45 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
       finish_for_stmt (stmt);
       break;
 
+    case CILK_FOR_STMT:
+      {
+	stmt = begin_cilk_for_stmt (NULL_TREE, NULL_TREE);
+	CILK_FOR_INIT (stmt) = RECUR (CILK_FOR_INIT (t));
+	finish_cilk_for_init_stmt (stmt);
+	tmp = RECUR (CILK_FOR_VAR (t));
+	CILK_FOR_VAR (stmt) = tmp;
+	CILK_FOR_GRAIN (stmt) = CILK_FOR_GRAIN (t);
+
+	tmp = CILK_FOR_COND (t);
+	if (COMPARISON_CLASS_P (tmp))
+	  {
+	    tree op0 = RECUR (TREE_OPERAND (tmp, 0));
+	    tree op1 = RECUR (TREE_OPERAND (tmp, 1));
+	    tmp = build2 (TREE_CODE (tmp), boolean_type_node, op0, op1);
+	  }
+	CILK_FOR_COND (stmt) = tmp;
+
+	tmp = CILK_FOR_EXPR (t);
+	if (TREE_CODE (tmp) == MODIFY_EXPR)
+	  {
+	    tree lhs = TREE_OPERAND (tmp, 0);
+	    tree rhs = TREE_OPERAND (tmp, 1);
+	    lhs = RECUR (lhs);
+	    rhs = build2 (TREE_CODE (rhs), TREE_TYPE (lhs),
+			  RECUR (TREE_OPERAND (rhs, 0)),
+			  RECUR (TREE_OPERAND (rhs, 1)));
+	    tmp = build2 (MODIFY_EXPR, void_type_node, lhs, rhs);
+	  }
+	else
+	  tmp = build2 (TREE_CODE (tmp), void_type_node,
+			RECUR (TREE_OPERAND (tmp, 0)),
+			RECUR (TREE_OPERAND (tmp, 1)));
+	finish_for_expr (tmp, stmt);
+	RECUR (CILK_FOR_BODY (t));
+	stmt = finish_cilk_for_stmt (stmt);
+	CILK_FOR_GRAIN (stmt) = RECUR (CILK_FOR_GRAIN (t));	
+	break;
+      }
     case RANGE_FOR_STMT:
       {
         tree decl, expr;
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index a07b0ef..3086009 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -826,7 +826,8 @@ finish_return_stmt (tree expr)
   return r;
 }
 
-/* Begin the scope of a for-statement or a range-for-statement.
+/* Begin the scope of a for-statement, a _Cilk_for statement,
+   or a range-for-statement.
    Both the returned trees are to be used in a call to
    begin_for_stmt or begin_range_for_stmt.  */
 
@@ -899,7 +900,7 @@ finish_for_cond (tree cond, tree for_stmt, bool ivdep)
 }
 
 /* Finish the increment-EXPRESSION in a for-statement, which may be
-   given by FOR_STMT.  */
+   given by FOR_STMT or CILK_FOR_STMT.  */
 
 void
 finish_for_expr (tree expr, tree for_stmt)
@@ -926,7 +927,10 @@ finish_for_expr (tree expr, tree for_stmt)
   expr = maybe_cleanup_point_expr_void (expr);
   if (check_for_bare_parameter_packs (expr))
     expr = error_mark_node;
-  FOR_EXPR (for_stmt) = expr;
+  if (TREE_CODE (for_stmt) == CILK_FOR_STMT)
+    CILK_FOR_EXPR (for_stmt) = expr;
+  else
+    FOR_EXPR (for_stmt) = expr;
 }
 
 /* Finish the body of a for-statement, which may be given by
@@ -6655,6 +6659,18 @@ finish_omp_cancellation_point (tree clauses)
   finish_expr_stmt (stmt);
 }
 \f
+
+/* Perform any canonicalization of the conditional in a Cilk for loop.  */
+tree
+finish_cilk_for_cond (tree cond)
+{
+  if (!processing_template_decl)
+    return cp_truthvalue_conversion (cond);
+  else
+    return cond;
+}
+\f
+
 /* Begin a __transaction_atomic or __transaction_relaxed statement.
    If PCOMPOUND is non-null, this is for a function-transaction-block, and we
    should create an extra compound stmt.  */
@@ -10603,4 +10619,51 @@ capture_decltype (tree decl)
   return type;
 }
 
+/* Begin a _Cilk_for statement.  Returns a new CILK_FOR_STMT.
+   SCOPE and INIT should be the return of begin_for_scope,
+   or both NULL_TREE.  */
+
+tree
+begin_cilk_for_stmt (tree scope, tree init)
+{
+  tree cilk_for_stmt = build_stmt (input_location, CILK_FOR_STMT, NULL_TREE,
+				   NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE,
+				   NULL_TREE, NULL_TREE);
+  if (scope == NULL_TREE)
+    {
+      if (!init)
+	scope = begin_for_scope (&init);
+    }
+  CILK_FOR_INIT (cilk_for_stmt) = init;
+  CILK_FOR_SCOPE (cilk_for_stmt) = scope;
+  return cilk_for_stmt;
+}
+
+/* Finish the for-init-statement of a _Cilk_for statement, which is given
+   by C_FOR_STMT.  */
+
+void
+finish_cilk_for_init_stmt (tree c_for_stmt)
+{
+  if (processing_template_decl)
+    CILK_FOR_INIT (c_for_stmt) = pop_stmt_list (CILK_FOR_INIT (c_for_stmt));
+  CILK_FOR_BODY (c_for_stmt) = do_pushlevel (sk_block);
+}
+
+/* Finish the body of a _Cilk_for statement, which is given by
+   CILK_FOR_STMT.  Returns a CILK_FOR_STMT that is type checked.  */
+
+tree
+finish_cilk_for_stmt (tree cilk_for_stmt)
+{
+  CILK_FOR_BODY (cilk_for_stmt) = do_poplevel (CILK_FOR_BODY (cilk_for_stmt));
+  tree *scope_ptr = &CILK_FOR_SCOPE (cilk_for_stmt);
+  tree scope = *scope_ptr;
+  *scope_ptr = NULL;
+  add_stmt (do_poplevel (scope));
+  cp_finish_cilk_for_loop (&cilk_for_stmt, processing_template_decl);
+  add_stmt (cilk_for_stmt);
+  return cilk_for_stmt;
+}
+
 #include "gt-cp-semantics.h"
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc
new file mode 100644
index 0000000..dec650c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-start-at-5.cc
@@ -0,0 +1,42 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+
+int j[10];
+
+int main(void)
+{
+  int error = 0;
+  int j_serial[10];
+  for (int ii = 0; ii < 10; ii++)
+    {
+      j[ii] = 10;
+      j_serial[ii] = 10;
+    }
+  _Cilk_for (int ii = 5; ii < 10; ii++)
+    {
+      j[ii]=ii;
+    }
+
+  for (int ii = 5; ii < 10; ii++)
+    {
+      j_serial[ii] = ii;
+    }
+
+  for (int ii = 0; ii < 10; ii++)
+    {
+      if (j[ii] != j_serial[ii]) 
+	error = 1;    
+    }
+
+  if (error)
+    __builtin_abort ();
+  else
+    return 0;
+
+  return j[9];
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
new file mode 100644
index 0000000..8221371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
@@ -0,0 +1,25 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define SIZE 100
+#define CHECK_VALUE 5
+
+template <class T>
+int func (T start, T end)
+{
+  int Array[SIZE];
+  _Cilk_for (T ii = 0; ii < end; ii++)
+    Array[ii] = CHECK_VALUE;
+  
+  for (T ii = 0; ii < end; ii++)
+    if (Array[ii] != CHECK_VALUE)
+      __builtin_abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  return func <int> (0, 100) + func <long> (0, 100);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc
new file mode 100644
index 0000000..30ea29d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for.cc
@@ -0,0 +1,34 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int main(int argc, char **argv)
+{
+  char Array1[26], Array2[26];
+  char Array1_Serial[26], Array2_Serial[26];
+
+  for (int ii = 0; ii < 26; ii++)  
+    { 
+      Array1[ii] = 'A'+ii;
+      Array1_Serial[ii] = 'A'+ii;
+    }
+  for (int ii = 0; ii < 26; ii++)
+    {
+      Array2[ii] = 'a'+ii;
+      Array2_Serial[ii] = 'a'+ii;
+    }
+
+  _Cilk_for (int ii = 0 ; ii < 26; ii++) 
+    Array1[ii] = Array2[ii];
+
+  for (int ii = 0; ii < 26; ii++)
+    Array1_Serial[ii] = Array2_Serial[ii];
+
+  for (int ii = 0; ii < 26; ii++)  {
+    if (Array1_Serial[ii] != Array1[ii])  { 
+	__builtin_abort ();
+    }
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc
new file mode 100644
index 0000000..3759a36
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_inside_for.cc
@@ -0,0 +1,22 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int q[10], seq[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  for (int jj = 0; jj < 10; jj++)  
+	    {
+	      if (seq[jj] == 5)
+		continue;
+	      else
+		seq[jj] = 2;
+	    }
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc
new file mode 100644
index 0000000..38c4d51
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_for.cc
@@ -0,0 +1,19 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  for (int jj = 0; jj < 10; jj++) 
+	    seq2[jj] = 5;
+	  continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc
new file mode 100644
index 0000000..e68c700
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_if.cc
@@ -0,0 +1,18 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii = max - 1; ii >= start; ii--) 
+	{ 
+	  if (q[ii] != 0) 
+	    continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc
new file mode 100644
index 0000000..17fd064
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_cont_with_while.cc
@@ -0,0 +1,23 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int q[10], seq2[10];
+int main (int argc, char** argv)
+{
+   
+    int max = 10, start = 0;
+      _Cilk_for(int ii=max - 1; ii>=start; ii--) 
+	{ 
+	  int jj = 0;
+	  while (jj < 10)
+	    {
+	      seq2[jj] = 1;
+	      jj++;
+	    }
+	  continue;
+	}
+        return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc
new file mode 100644
index 0000000..f0ad2a3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_genricize_test.cc
@@ -0,0 +1,42 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <assert.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <vector>
+#include <list>
+#if HAVE_IO 
+#include <stdio.h>
+#endif
+#define NUMBER 500
+#include <stdlib.h>
+typedef std::pair<int, int> my_type_t;
+
+long
+valid_pairs(std::vector< my_type_t > my_list) 
+{
+  _Cilk_for (int ii = 0; ii < my_list.size(); ii++) 
+    {
+#if HAVE_IO
+    fprintf(stderr, "my_list index: %d, size: %zu.\n", ii, my_list.size());
+#endif
+      if (ii < 0 || ii >= my_list.size())
+	abort (); 
+    }
+  return 0;
+}
+
+int main(int argc, char **argv) 
+{
+  std::vector<my_type_t> my_list;
+
+  for (int ii = 0; ii < NUMBER; ii++) 
+    my_list.push_back(my_type_t(ii, ii));
+  long res = valid_pairs(my_list);
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc
new file mode 100644
index 0000000..7d54828
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_grainsize.cc
@@ -0,0 +1,77 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+int x = 5;
+int q = 25;
+int z = 2;
+
+int square (int b)
+{
+  return (b*b);
+}
+
+template<class T>
+int templated_func (T a, T b, T c)
+{
+  T Array[10];
+#pragma cilk grainsize = a
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = a;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != a)
+      __builtin_abort ();
+
+#pragma cilk grainsize = square ((int) (b/c))
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = b;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != b)
+      __builtin_abort ();
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = c;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != c)
+      __builtin_abort ();
+
+  return 0;
+}
+
+ 
+
+int main (void)
+{
+  int Array[10];
+#pragma cilk grainsize = 5
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 5)
+      __builtin_abort ();
+
+
+#pragma cilk grainsize = x
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 10;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 10)
+      __builtin_abort ();
+
+#pragma cilk grainsize = square (z)
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 15;
+
+  for (int ii = 0; ii < 10; ii++)
+    if (Array[ii] != 15)
+      __builtin_abort ();
+
+  int r = 5, s=10, t =15;
+  return templated_func (r, s, t);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc
new file mode 100644
index 0000000..4c69712
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_p_errors.cc
@@ -0,0 +1,52 @@
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+
+int main (void)
+{
+  int a, iii = 0;
+  _Cilk_for (; iii < 10; iii++) /* { dg-error "expected induction variable" } */
+    a = 5;
+
+  _Cilk_for (iii = 0; iii < 10; iii++) /* { dg-error " must declare variable" } */
+    a = 5;
+
+  _Cilk_for (int qq = 0, jj = 0; qq < 10; qq++) /* { dg-error " initializer cannot have multiple variable declarations" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0, int jj = 0; ii < 10; ii++) /* { dg-error " initializer cannot have multiple variable declarations" } */
+    a = 5;
+
+  _Cilk_for (int rr = 0; ; rr++) /* { dg-error "missing condition" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii = 5; ii++) /* { dg-error "invalid controlling predicate" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii == 5; ii++) /* { dg-error "invalid controlling predicate" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10;) /* { dg-error "missing increment" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii ) /* { dg-error "invalid increment expression" } */
+    a = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      a = 5;
+      if (ii == 5)
+	break; /* { dg-error "break statement used in _Cilk_for loop body" } */
+    }
+
+#pragma cilk grainsize 5 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+
+#pragma Silk grainsize = 5 /* { dg-warning "ignoring #pragma Silk grainsize" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+#pragma cilk grainsiz = 5 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    a = 5;
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc
new file mode 100644
index 0000000..b597764
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk_for_t_errors.cc
@@ -0,0 +1,30 @@
+/* { dg-options "-fcilkplus" } */
+
+#include <setjmp.h>
+int main (void)
+{
+  int a, iii = 0;
+
+  _Cilk_for (volatile int ii = 0; ii < 10; ii++) /* { dg-error "iteration variable cannot be volatile" } */
+    a = 5;
+
+  _Cilk_for (static int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+  _Cilk_for (register int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+
+  _Cilk_for (extern int ii = 0; ii < 10; ii++) /* { dg-error "storage class is not allowed" } */
+    a = 5;
+
+  _Cilk_for (float ii = 0.0; ii < 10.0; ii += 0.5) /* { dg-error "induction variable must be of integral record or pointer type" } */
+    a = 5;
+
+  jmp_buf env;
+  _Cilk_for (int ii = 0; ii < 10; ii++) 
+    {
+      a = 5;
+      setjmp (env); /* { dg-error "calls to setjmp are not allowed within" } */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc
new file mode 100644
index 0000000..89f6403
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/explicit_ctor.cc
@@ -0,0 +1,27 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+struct BruceBoxleitner {
+    int m;
+    BruceBoxleitner (int n = 0) : m(n) { }
+    BruceBoxleitner operator--() { --m; return *this; }
+};
+
+int operator- (BruceBoxleitner a, BruceBoxleitner b) { return a.m - b.m; }
+
+struct BruceLee {
+    int m;
+    explicit BruceLee (int n) : m(n) { }
+};
+
+bool operator> (BruceBoxleitner a, BruceLee b) { return a.m > b.m; }
+int operator- (BruceBoxleitner a, BruceLee b) { return a.m - b.m; }
+
+int main () {
+    _Cilk_for (BruceBoxleitner i = 10; i > BruceLee(0); --i)
+      ;
+    return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc
new file mode 100644
index 0000000..495e9b4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/label_test.cc
@@ -0,0 +1,26 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int main(void)
+{
+  int jj = 0;
+  int total = 0;
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if ((ii % 2) == 0)
+	goto hello_label;
+      else
+	goto world_label;
+
+hello_label:
+     total++;
+world_label:
+     total++;
+    }
+  if (total != 15)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc
new file mode 100644
index 0000000..582ef60
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/no-opp-overload-error.cc
@@ -0,0 +1,88 @@
+/* { dg-options "-fcilkplus" } */
+
+
+#define NUMBER_OF_ELEMENTS 10
+
+#include <cstdlib>
+
+class my_class {
+private:
+  int value;
+public:
+
+  my_class ();
+  my_class (const my_class &val);
+  my_class (my_class &val);
+  my_class (int val);
+  ~my_class ();
+  int getValue();
+  my_class &operator= (my_class &new_value);
+  my_class &operator= (int x);
+  my_class &operator+= (int val)
+  {
+    value += val;
+    return *this;
+  }
+  bool operator< (const my_class &val)
+  {
+    return (value < val.value);
+  }
+};
+
+
+my_class::my_class ()
+{
+  value = 0;
+}
+
+my_class::my_class(int val)
+{
+  value = val;
+}
+
+my_class::my_class (my_class &val)
+{
+  value = val.value;
+}
+
+my_class::my_class (const my_class &val)
+{
+  value = val.value;
+}
+
+my_class::~my_class ()
+{
+  value = -1;
+}
+
+int my_class::getValue ()
+{
+  return value;
+}
+
+my_class & my_class::operator= (my_class &new_value)
+{
+  value = new_value.value;
+  return *this;
+}
+
+my_class &my_class::operator= (int x)
+{
+  value = x;
+  return *this;
+}
+
+int main (void)
+{
+  int n, *array_parallel;
+  my_class length (NUMBER_OF_ELEMENTS);
+    n = NUMBER_OF_ELEMENTS;
+  
+  array_parallel = new int[NUMBER_OF_ELEMENTS];
+  _Cilk_for (my_class ii (0); ii < length; ii += 1) { /* { dg-error " No operator-" } */
+      int x = ii.getValue();
+    array_parallel [x] = x * 2;
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc
new file mode 100644
index 0000000..1326308
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-one.cc
@@ -0,0 +1,59 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#if HAVE_IO
+#include <cstdio>
+#endif
+#define TEST 1
+
+
+#define ITER 300
+
+int n_errors;
+#if TEST
+void test (int *array, int n, int val) {
+#if HAVE_IO
+    for (int i = 0; i < n; i++)
+      std::printf("array[%3d] = %2d\n", i, array[i]);
+#endif
+    for (int i = 0; i < n; ++i) {
+        if (array[i] != val) {
+           __builtin_abort (); 
+        }
+    }
+}
+#endif
+ 
+
+int main () {
+    int array[ITER];
+  
+    for (int ii = 0; ii < ITER; ii++)
+      array[ii] = 9;
+    _Cilk_for (int *j = (array); j < array + ITER; j += 1)  {
+       *j = 6; 
+    }
+#if TEST
+    test(array, ITER, 6);
+#endif
+
+    _Cilk_for (int *i = array; i < array + ITER; i += 1) {
+        *i = 1;
+    }
+
+#if TEST
+    test(array, ITER, 1);
+#endif
+
+    _Cilk_for (int *k = array+ITER-1; k >= array; k -= 1) {
+        *k = 8;
+    }
+#if TEST
+    test(array, ITER, 8);
+#endif
+  
+    return 0;
+
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc
new file mode 100644
index 0000000..0ca588d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/plus-equal-test.cc
@@ -0,0 +1,111 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#define NUMBER_OF_ELEMENTS 10
+
+#include <cstdlib>
+
+#if HAVE_IO
+#include <cstdio>
+#endif
+
+class my_class {
+private:
+  int value;
+public:
+
+  my_class ();
+  my_class (const my_class &val);
+  my_class (my_class &val);
+  my_class (int val);
+  ~my_class ();
+  int getValue();
+  my_class &operator= (my_class &new_value);
+  my_class &operator+= (int val)
+  {
+    value += val;
+    return *this;
+  }
+  bool operator< (const my_class &val)
+  {
+    return (value < val.value);
+  }
+};
+
+
+my_class::my_class ()
+{
+  value = 0;
+}
+
+my_class::my_class(int val)
+{
+  value = val;
+}
+
+my_class::my_class (my_class &val)
+{
+  value = val.value;
+}
+
+my_class::my_class (const my_class &val)
+{
+  value = val.value;
+}
+
+my_class::~my_class ()
+{
+  value = -1;
+}
+
+int my_class::getValue ()
+{
+  return value;
+}
+
+my_class & my_class::operator= (my_class &new_value)
+{
+  value = new_value.value;
+  return *this;
+}
+
+int operator- (my_class x, my_class y)
+{
+  int val_x = x.getValue ();
+  int val_y = y.getValue ();
+  return (val_x - val_y);
+}
+
+
+int main (void)
+{
+  int n, *array_parallel, *array_serial;
+  my_class length (NUMBER_OF_ELEMENTS);
+    n = NUMBER_OF_ELEMENTS;
+  
+  array_parallel = new int[NUMBER_OF_ELEMENTS];
+  array_serial = new int[NUMBER_OF_ELEMENTS];
+
+  _Cilk_for (my_class ii (0); ii < length; ii += 1) {
+#if HAVE_IO
+    std::printf("ii.getValue() = %d\n", ii.getValue ());
+#endif
+    array_parallel [ii.getValue ()] = ii.getValue() * 2;
+  }
+
+  for (my_class ii (0); ii < length; ii += 1)
+    array_serial [ii.getValue ()] = ii.getValue () * 2;
+  
+  for (int ii = 0; ii < NUMBER_OF_ELEMENTS; ii++)
+    if (array_serial[ii] != array_parallel[ii]) {
+#if HAVE_IO
+      std::printf("array_serial[%3d] = %6d\tarray_parallel[%3d] = %6d\n", ii,
+		  array_serial[ii], ii, array_parallel[ii]);
+#endif
+      __builtin_abort ();
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
new file mode 100644
index 0000000..e4f2ee5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
@@ -0,0 +1,58 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+int main(void)
+{
+vector <int> array;
+vector <int> array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back (ii);
+}
+#endif
+#if 1
+_Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end(); 
+	   iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+#endif
+for (vector<int>::iterator iter2 = array_serial.begin(); 
+     iter2 != array_serial.end(); iter2++)
+{
+   if (*iter2  == 6) 
+     *iter2 = 13;
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter3 = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+
+while (iter3 != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter3 != *iter_serial)
+    abort ();
+  iter3++;
+  iter_serial++;
+}
+
+return 0;
+}   
+
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
new file mode 100644
index 0000000..3e350a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
@@ -0,0 +1,50 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <iostream>
+#include <cstdio>
+#include <cstdlib>
+#include <vector>
+#include <algorithm>
+#include <list>
+
+using namespace std;
+
+
+int main(int argc, char **argv)
+{
+  vector <int> number_list, number_list_serial;
+  int new_number = 0;
+  int no_elements = 0;
+  
+  if (argc != 2)
+  {
+    no_elements = 10000;
+  }
+
+
+  number_list.clear();
+  number_list_serial.clear();
+  for (int ii = 0; ii < no_elements; ii++)
+  {
+    number_list.push_back(new_number);
+    number_list_serial.push_back(new_number);
+  }
+
+  _Cilk_for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list[jj] = jj + no_elements;
+  }
+  for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list_serial[jj] = jj + no_elements;
+  }
+
+  for (int jj = 0; jj < no_elements; jj++)
+    if (number_list_serial[jj] != number_list[jj])
+      abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
index 90faca4..20a8b55 100644
--- a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
+++ b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
@@ -80,7 +80,7 @@ dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -f
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1 -fcilkplus $ALWAYS_CFLAGS" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -fcilkplus $ALWAYS_CFLAGS" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -fcilkplus $ALWAYS_CFLAGS" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -fcilkplus $ALWAYS_CFLAGS" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -O2 -fcilkplus $ALWAYS_CFLAGS" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -O3 -fcilkplus $ALWAYS_CFLAGS" " "
 dg-finish

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-11-26  8:16       ` Iyer, Balaji V
@ 2013-11-27 17:59         ` Jason Merrill
  2013-11-27 22:31           ` Jeff Law
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Merrill @ 2013-11-27 17:59 UTC (permalink / raw)
  To: Iyer, Balaji V, Aldy Hernandez; +Cc: gcc-patches, Jeff Law, rth, Jakub Jelinek

On 11/25/2013 11:03 PM, Iyer, Balaji V wrote:

On a broad note, I think there's a lot of OpenMP code you could be 
reusing here rather than writing it all again.  And that way Cilk code 
will benefit from improvements to OpenMP handling, and vice versa.  It 
probably makes sense to turn Cilk_for into an OMP_FOR loop, and then 
gimplify into GIMPLE_OMP_FOR, rather than create a new tree code and 
handle everything at the tree level.  But I don't know the OMP code well 
enough to suggest exactly how that would work.

Finer-grained comments:

> +  tree exp = build_new_op (EXPR_LOCATION (op1), code, flags, op0, op1,
> +			   NULL_TREE, NULL, tf_none);
> +  if (exp == error_mark_node)
> +    exp = build_x_modify_expr (EXPR_LOCATION (op1), op0, code, op1, tf_none);
> +  if (exp && exp != error_mark_node)
> +    return exp;

I thought you were changing this?

> +/* Handler for iterator to compute the loop variable.  ADD_OP indicates
> +   whether we need a '+' or '-' operation. LOW indicates the starting point
> +   and LOOP_VAR is the induction variable.  This functin returns an
> +   INIT_EXPR.  */

This comment still doesn't document VAR2.

"function"

> +  tree exp = build_new_op (loc, add_op, 0, low, loop_var, NULL_TREE, 0,
> +			   tf_none);
> +  gcc_assert (exp != error_mark_node);
> +  exp = cp_build_modify_expr (var2, INIT_EXPR, exp, tf_warning_or_error);

Looking at online Cilk documentation I see:

> The increment expression must add to or subtract from the control variable using one of the following supported operations:
> +=
> -=
> ++ (prefix or postfix)
> -- (prefix or postfix)

From this, I think people would expect the increment to use a 
user-defined operator+=/-=/++/--, but your code above uses operator+/- 
instead.
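
For reference, an ordinary serial for loop already dispatches to the user-defined operator named in the increment. The sketch below uses a hypothetical counter class (not part of the patch) whose operator+= counts its own calls. It only illustrates the behavior users would probably expect from _Cilk_for too, and is not a statement about how the patch should implement it.

```cpp
#include <cassert>

// Hypothetical induction-variable class, for illustration only.
struct counter {
  int value;
  static int plus_equal_calls;   // how many times operator+= was invoked

  explicit counter(int v) : value(v) {}

  // User-defined compound assignment: the operator the loop increment names.
  counter &operator+=(int step) {
    ++plus_equal_calls;
    value += step;
    return *this;
  }

  bool operator<(const counter &other) const { return value < other.value; }
};

int counter::plus_equal_calls = 0;

// Runs a serial loop over [0, n) and returns the number of iterations.
int run_loop(int n) {
  int iterations = 0;
  counter limit(n);
  for (counter ii(0); ii < limit; ii += 1)
    ++iterations;
  return iterations;
}
```

Calling run_loop(10) invokes counter::operator+= once per iteration. No operator+ is defined for counter at all, so a lowering that relies on operator+ would have nothing to call here.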

> +		    "-fcilkplus must be enabled t use %<_Cilk_for%>");

"to"

> +cp_parser_cilk_for (cp_parser *parser, tree grain)

Please reuse cp_parser_omp_for, like Aldy did for #pragma simd 
(cp_parser_cilk_simd) rather than write yet another for-statement 
parser.  This should reduce the patch size quite a bit.

> +    case PRAGMA_CILK_GRAINSIZE:
> +      if (context == pragma_external)
> +	{
> +	  error_at (pragma_tok->location,
> +		    "%<#pragma cilk grainsize%> may only be be used inside a "
> +		    "function");
> +	  break;
> +	}
> +
> +      /* Ignore the pragma if Cilk Plus is not enabled.  */
> +      if (flag_enable_cilkplus)
> +	{
> +	  cp_parser_cilk_grainsize (parser, pragma_tok);
> +	  return true;
> +	}
>      default:

Do you mean to fall through to the default case if Cilk+ is not enabled?

> +	tmp = CILK_FOR_COND (t);
> +	if (COMPARISON_CLASS_P (tmp))
> +	  {
> +	    tree op0 = RECUR (TREE_OPERAND (tmp, 0));
> +	    tree op1 = RECUR (TREE_OPERAND (tmp, 1));
> +	    tmp = build2 (TREE_CODE (tmp), boolean_type_node, op0, op1);
> +	  }
> +	CILK_FOR_COND (stmt) = tmp;

Why not just recur into CILK_FOR_COND?

> +	tmp = CILK_FOR_EXPR (t);
> +	if (TREE_CODE (tmp) == MODIFY_EXPR)
> +	  {
> +	    tree lhs = TREE_OPERAND (tmp, 0);
> +	    tree rhs = TREE_OPERAND (tmp, 1);
> +	    lhs = RECUR (lhs);
> +	    rhs = build2 (TREE_CODE (rhs), TREE_TYPE (lhs),
> +			  RECUR (TREE_OPERAND (rhs, 0)),
> +			  RECUR (TREE_OPERAND (rhs, 1)));
> +	    tmp = build2 (MODIFY_EXPR, void_type_node, lhs, rhs);
> +	  }
> +	else
> +	  tmp = build2 (TREE_CODE (tmp), void_type_node,
> +			RECUR (TREE_OPERAND (tmp, 0)),
> +			RECUR (TREE_OPERAND (tmp, 1)));
> +	finish_for_expr (tmp, stmt);

And CILK_FOR_EXPR?

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-11-15 21:45 [PATCH] _Cilk_for for C and C++ Iyer, Balaji V
  2013-11-16  1:38 ` Aldy Hernandez
  2013-11-20  8:05 ` Aldy Hernandez
@ 2013-11-27 18:37 ` Jason Merrill
  2013-11-27 18:49   ` Jakub Jelinek
  2 siblings, 1 reply; 42+ messages in thread
From: Jason Merrill @ 2013-11-27 18:37 UTC (permalink / raw)
  To: Iyer, Balaji V, gcc-patches, Jeff Law
  Cc: Aldy Hernandez (aldyh@redhat.com), rth

On 11/15/2013 02:23 PM, Iyer, Balaji V wrote:
> One small thing that I have not done that Jakub and several other have asked me before is that, there are no tests in c-c++-common for _Cilk_for. The reason being that the syntax between C and C++ implementations are different. In C++, the induction variable must be defined in the initializer (e.g. it should start wth _Cilk_for (int ii = 0....)). In C, this is not allowed (e.g. it should start as _Cilk_for (ii = 0; ii < 10; ii++)).

That can be handled with #ifdef __cplusplus.
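
A rough sketch of what such a shared test body could look like. LOOP_KEYWORD is a hypothetical macro that defaults to plain `for` so the sketch builds without Cilk Plus support; a real test would presumably use _Cilk_for directly. The function name fill is also made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in so the sketch compiles without -fcilkplus.
#ifndef LOOP_KEYWORD
#define LOOP_KEYWORD for
#endif

// Fills array[0..n) with twice the index and returns n.
int fill(int *array, int n) {
#ifdef __cplusplus
  // C++ form: the induction variable is declared in the initializer.
  LOOP_KEYWORD (int ii = 0; ii < n; ii++)
    array[ii] = ii * 2;
#else
  // C form described in the patch: the variable is declared beforehand.
  int ii;
  LOOP_KEYWORD (ii = 0; ii < n; ii++)
    array[ii] = ii * 2;
#endif
  return n;
}
```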

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-11-27 18:37 ` Jason Merrill
@ 2013-11-27 18:49   ` Jakub Jelinek
  2013-11-27 19:04     ` Aldy Hernandez
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Jelinek @ 2013-11-27 18:49 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Iyer, Balaji V, gcc-patches, Jeff Law,
	Aldy Hernandez (aldyh@redhat.com),
	rth

On Wed, Nov 27, 2013 at 12:48:11PM -0500, Jason Merrill wrote:
> On 11/15/2013 02:23 PM, Iyer, Balaji V wrote:
> >One small thing that I have not done that Jakub and several other have asked me before is that, there are no tests in c-c++-common for _Cilk_for. The reason being that the syntax between C and C++ implementations are different. In C++, the induction variable must be defined in the initializer (e.g. it should start wth _Cilk_for (int ii = 0....)). In C, this is not allowed (e.g. it should start as _Cilk_for (ii = 0; ii < 10; ii++)).
> 
> That can be handled with #ifdef __cplusplus.

It isn't allowed even in C99?  For OpenMP,
int a[30];

void
foo ()
{
  #pragma omp for
  for (int i = 0; i < 30; i++)
    a[i] = i;
}
is valid for C99.  So, perhaps you just want
/* { dg-additional-options "-std=c99" { target c } } */
in the c-c++-common tests?

	Jakub

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-11-27 18:49   ` Jakub Jelinek
@ 2013-11-27 19:04     ` Aldy Hernandez
  0 siblings, 0 replies; 42+ messages in thread
From: Aldy Hernandez @ 2013-11-27 19:04 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jason Merrill, Iyer, Balaji V, gcc-patches, Jeff Law, rth

On 11/27/13 10:54, Jakub Jelinek wrote:
> On Wed, Nov 27, 2013 at 12:48:11PM -0500, Jason Merrill wrote:
>> On 11/15/2013 02:23 PM, Iyer, Balaji V wrote:
>>> One small thing that I have not done that Jakub and several other have asked me before is that, there are no tests in c-c++-common for _Cilk_for. The reason being that the syntax between C and C++ implementations are different. In C++, the induction variable must be defined in the initializer (e.g. it should start wth _Cilk_for (int ii = 0....)). In C, this is not allowed (e.g. it should start as _Cilk_for (ii = 0; ii < 10; ii++)).
>>
>> That can be handled with #ifdef __cplusplus.
>
> It isn't allowed even in C99?  For OpenMP,
> int a[30];
>
> void
> foo ()
> {
>    #pragma omp for
>    for (int i = 0; i < 30; i++)
>      a[i] = i;
> }
> is valid for C99.  So, perhaps you just want
> /* { dg-additional-options "-std=c99" { target c } } */
> in the c-c++-common tests?
>
> 	Jakub
>

Yup, that's what I did for the Cilk Plus pragma simd tests.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-11-27 17:59         ` Jason Merrill
@ 2013-11-27 22:31           ` Jeff Law
  2013-11-27 23:04             ` Iyer, Balaji V
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff Law @ 2013-11-27 22:31 UTC (permalink / raw)
  To: Jason Merrill, Iyer, Balaji V, Aldy Hernandez
  Cc: gcc-patches, rth, Jakub Jelinek

On 11/27/13 10:06, Jason Merrill wrote:
> On 11/25/2013 11:03 PM, Iyer, Balaji V wrote:
>
> On a broad note, I think there's a lot of OpenMP code you could be
> reusing here rather than writing it all again.  And that way Cilk code
> will benefit from improvements to OpenMP handling, and vice versa.  It
> probably makes sense to turn Cilk_for into an OMP_FOR loop, and then
> gimplify into GIMPLE_OMP_FOR, rather than create a new tree code and
> handle everything at the tree level.  But I don't know the OMP code well
> enough to suggest exactly how that would work.
That's certainly the direction I'd like to see this work go as well.  To 
the fullest extent possible Cilk+ should be layering on top of the 
OpenMP 4 work -- ie, Cilk+ should really be dealing with parsing issues, 
then handoff to OpenMP for the real work.

Jeff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2013-11-27 22:31           ` Jeff Law
@ 2013-11-27 23:04             ` Iyer, Balaji V
  2013-11-28  8:31               ` Jason Merrill
  0 siblings, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2013-11-27 23:04 UTC (permalink / raw)
  To: Jeff Law, Jason Merrill, Aldy Hernandez; +Cc: gcc-patches, rth, Jakub Jelinek



> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Wednesday, November 27, 2013 2:43 PM
> To: Jason Merrill; Iyer, Balaji V; Aldy Hernandez
> Cc: gcc-patches@gcc.gnu.org; rth@redhat.com; Jakub Jelinek
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On 11/27/13 10:06, Jason Merrill wrote:
> > On 11/25/2013 11:03 PM, Iyer, Balaji V wrote:
> >
> > On a broad note, I think there's a lot of OpenMP code you could be
> > reusing here rather than writing it all again.  And that way Cilk code
> > will benefit from improvements to OpenMP handling, and vice versa.  It
> > probably makes sense to turn Cilk_for into an OMP_FOR loop, and then
> > gimplify into GIMPLE_OMP_FOR, rather than create a new tree code and
> > handle everything at the tree level.  But I don't know the OMP code
> > well enough to suggest exactly how that would work.
> That's certainly the direction I'd like to see this work go as well.  To the fullest
> extent possible Cilk+ should be layering on top of the OpenMP 4 work -- ie,
> Cilk+ should really be dealing with parsing issues, then handoff to OpenMP
> for the real work.

Hello Jeff and Jason,
	I completely agree with you that certain parts of Cilk Plus are similar to OMP4, namely #pragma simd and SIMD-enabled functions (formerly called elemental functions). But the Cilk keywords are almost completely orthogonal to OpenMP. They are semantically different, and one cannot be transformed into the other. Cilk uses automatically load-balanced work stealing through the Cilk runtime, whereas OMP uses work sharing through the OMP runtime. There are a number of other semantic differences, but this is the core issue. #pragma simd and #pragma omp have converged in several places, but the Cilk part has always been different from OpenMP.

	I have thought about sharing routines with OpenMP and have done so in several parts of Cilk Plus. It is not possible to share any middle-end work between the Cilk keywords and OpenMP because they are fundamentally different. I have shared some parsing parts with OMP in C.

 	Since we are talking about _Cilk_for loops, maybe an example of how a compiler is supposed to break down a _Cilk_for loop will help. Please see the example below. It is a simple main routine with one _Cilk_for in it, and it returns a local variable X that may or may not be read or written in the body:


int main (void)
{
	int X = 0;
	_Cilk_for (int ii = 5; ii < 15; ii++)
	{
		<body>
	}
	return X;
}

This program is converted to the following:

/* The low and high arguments are passed in by the runtime, based on either the
   user-defined grainsize or the runtime-computed one.  The data argument is
   ignored in GCC; please see below.  */

cilk_for_helper_function  (void *data, int low, int high) {
	for (ii = low; ii < high; ii++)
		<body>;
}

int main (void)
{
	int X = 0;
	/* This is actually a call to the runtime, whose implementation is in
	   libcilkrts/runtime/cilk-abi-cilk-for.c.  */
	__cilkrts_cilk_for_64 (__cilk_for_001,  /* Nested/lambda function.  */
	                       __cilk_for_001,  /* Data used by the lambda function.  The
	                                           runtime does not worry about it; it is
	                                           an interface for passing information to
	                                           the lambda function.  In GCC we create a
	                                           nested function, so it is ignored.  */
	                       10,              /* loop_count (15 - 5).  */
	                       0);              /* Grain value from #pragma cilk grainsize.  */

	/* Note: if the trip count is 32 bits, __cilkrts_cilk_for_64 is replaced by
	   __cilkrts_cilk_for_32.  */
	
	return X;
}
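
The transformation above can be sketched as ordinary, serial C++ that compiles without Cilk Plus. Everything here is illustrative: standin_cilk_for is a hypothetical replacement for the real runtime entry point (it splits the range recursively down to the grain but runs the pieces one after another instead of stealing work), and loop_context, outlined_body and run_example are made-up names. The sketch also assumes that the runtime hands the body half-open ranges within [0, loop_count) and that the body adds the loop's start value back.

```cpp
#include <cassert>
#include <cstdint>

// Shape of the outlined loop body: a context pointer plus a half-open range.
typedef void (*loop_body_fn)(void *data, std::uint64_t low, std::uint64_t high);

// Hypothetical serial stand-in for the runtime call.  It halves [low, high)
// until a piece is no larger than the grain, then runs the body on it.  A
// real work-stealing runtime would let another worker take one of the halves.
static void standin_cilk_for(loop_body_fn body, void *data,
                             std::uint64_t low, std::uint64_t high,
                             std::uint64_t grain) {
  if (grain == 0)
    grain = 1;  // assumption: 0 means "no user grainsize"; use 1 in this sketch
  while (high - low > grain) {
    std::uint64_t mid = low + (high - low) / 2;
    standin_cilk_for(body, data, low, mid, grain);
    low = mid;
  }
  if (low < high)
    body(data, low, high);
}

// Context the body needs from the enclosing function.
struct loop_context {
  int *out;   // array written by the loop body
  int start;  // initial value of the induction variable (5 in the example)
};

// Outlined body of: _Cilk_for (int ii = 5; ii < 15; ii++) out[ii] = ii * 2;
static void outlined_body(void *data, std::uint64_t low, std::uint64_t high) {
  loop_context *ctx = static_cast<loop_context *>(data);
  for (std::uint64_t k = low; k < high; k++) {
    int ii = ctx->start + static_cast<int>(k);
    ctx->out[ii] = ii * 2;
  }
}

// The "main" side: package the context, then hand body and count to the runtime.
int run_example(int *out) {
  loop_context ctx = { out, 5 };
  standin_cilk_for(outlined_body, &ctx, 0, 10 /* 15 - 5 */, 3 /* grain */);
  return 0;
}
```

With a 15-element array initialized to zero, run_example leaves elements 0 through 4 untouched and stores ii * 2 in elements 5 through 14, regardless of how the range was split.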


As you can tell, this is not how openmp handles a #pragma omp for loop.

Thanks,

Balaji V. Iyer.

> 
> Jeff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-11-19  1:11   ` Iyer, Balaji V
  2013-11-22 19:45     ` Jason Merrill
@ 2013-11-27 23:55     ` Jeff Law
  1 sibling, 0 replies; 42+ messages in thread
From: Jeff Law @ 2013-11-27 23:55 UTC (permalink / raw)
  To: Iyer, Balaji V, Aldy Hernandez
  Cc: gcc-patches, Jason Merrill (jason@redhat.com), rth

On 11/18/13 14:50, Iyer, Balaji V wrote:
>
> 	Attached, please find a refreshed patches (one for C and 1 for C++).  The trunk was "diffed" after Aldy's check in of pragma simd was in. So, now this patch is only dependent on _Cilk_spawn and _Cilk_sync (mostly for execution of tests). They are tested on x86_64 and works successfully.
>
> Here are the fixed Changelog entries (C related changelogs are given first then C++):
I'm just getting started on the C stuff.  This likely will be a partial 
review.

First, it looks like you've got a ton of unrelated cleanup work going on 
in c-family/cilk.c.  Function/parameter renaming and the like.  If this 
stuff is important, can it go forward independently?  It's certainly 
hard to find Cilk_for stuff in there with all the unrelated changes 
going on.  Some of the stuff seems to have moved into cilk.h which is 
probably good too, but probably should be separated from the Cilk_for 
support patch as well.


The semantics of cilk_check_loop_difference_type seem a bit odd.  Can 
you explain a bit more how those semantics were chosen?   And given the 
implementation, isn't it impossible for it to return NULL_TREE, meaning 
that check in c_extract_cilk_for_fields is pointless?

Isn't cilk_simplify_tree just used in the validation code?  Can it be 
moved into that file and privatized?  It probably needs a better name 
too.  I have to also admit, it looks pretty bogus to just start 
stripping things like that.  Oh yea, declaring and using tree_ssa_... is 
not OK there, thus it's not ok to use STRIP_USELESS_TYPE_CONVERSION.

In general, could you do a pass over those functions with external 
linkage and if they're only used in one file, move them to that file and 
privatize them?

So Cilk_for has a type?  That's the impression I get by looking at 
cilk_loop_convert.  In reality, I think it's just poorly named.  I've 
got to believe there's something, somewhere that will do equivalent 
conversions for you :-)

There's a mismatch between direction tracking between 
cilk_calc_forward_div_op and cilk_compute_incr_direction.  The former 
handles -1, the latter never returns it.  I'd probably prefer to see the 
direction information represented as an ENUM anyway.
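
As a rough illustration of the enum idea (all names below are hypothetical and do not come from the patch), the direction could be one three-valued type that both the classification code and the trip-count code switch on, so neither side can disagree about what -1 means:

```cpp
#include <cassert>

// Hypothetical names; the patch's own helpers are not reproduced here.
enum loop_direction { DIR_UNKNOWN, DIR_FORWARD, DIR_BACKWARD };

// Classify a constant step by its sign.
loop_direction classify_step(long step) {
  if (step > 0)
    return DIR_FORWARD;
  if (step < 0)
    return DIR_BACKWARD;
  return DIR_UNKNOWN;
}

// Trip count of `for (i = low; i < high; i += step)` when step > 0, or of
// `for (i = low; i > high; i += step)` when step < 0.  Returns 0 for an
// empty range or an unknown direction.
unsigned long trip_count(long low, long high, long step) {
  switch (classify_step(step)) {
  case DIR_FORWARD:
    return high > low
             ? static_cast<unsigned long>((high - low + step - 1) / step)
             : 0UL;
  case DIR_BACKWARD:
    return low > high
             ? static_cast<unsigned long>((low - high - step - 1) / -step)
             : 0UL;
  default:
    return 0UL;
  }
}
```

For the loop from the earlier example, trip_count(5, 15, 1) gives the loop_count of 10.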
.
Presumably you're using nested functions to encapsulate the body for the 
Cilk_for loop?  Does the nested function have access to the parent's 
context?

If the verification routines have to stay, then isn't there a lot of 
commonality between (for example) cilk_set_incr_info and 
c_check_cilk_loop_incr?  Anything worth refactoring there?  Probably not 
I guess

I must have missed something -- when can we get a CONTINUE_STMT outside 
a loop body?  See cilk_resolve_continue_stmts.


In c-common.h, you added several prototypes for things in c-cilkplus.c. 
  We're trying to avoid doing that, instead favoring putting them into 
c-cilkplus.h and including that where needed.

Similarly in c-tree.h



Is there any significant sharable code between parsing a normal C for 
statement and a Cilk_for that we could factor out?  What about the 
validation code?  Can any of that be shared with OpenMP?  What about the 
gimplification?

Are the syntactical differences so significant that we can't share any of 
that code?

Any particular reason why the runtime uses different names for the 32bit 
and 64bit Cilk_for support?  ISTM you wouldn't ever be binding the 32bit 
and 64bit runtimes into a single application, so you could have just 
called it __cilkrts_cilk_for and be done with it.  I guess it's too late 
to change that now.  Or does the 32/64 split refer to the # bits 
necessary to hold the iteration counter?

Vertical whitespace was removed after the cilk_arrow function, causing it 
to end up immediately adjacent to the comment for the next function.  I 
suspect this was unintentional.


You define CILK_FOR_STMT and new gimplification routines to handle it. 
Is there any way you can funnel all this through the OpenMP 4 support?

It looks like you've got unrelated cleanups going on in c-family/cilk.c. 
  Function renaming and such.  Is that strictly related to Cilk_for 
support?  Much of it looks independent of Cilk_for support.  Seems to me 
like that should break out and go forward independently.

I'm going to have to look at this some more, but I wanted to give you as 
much feedback as I could before I disappear for the holidays.


jeff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-11-27 23:04             ` Iyer, Balaji V
@ 2013-11-28  8:31               ` Jason Merrill
  2013-12-03  6:30                 ` Jeff Law
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Merrill @ 2013-11-28  8:31 UTC (permalink / raw)
  To: Iyer, Balaji V, Jeff Law, Aldy Hernandez; +Cc: gcc-patches, rth, Jakub Jelinek

On 11/27/2013 04:14 PM, Iyer, Balaji V wrote:
> 	I completely agree with you that there are certain parts of Cilk Plus that is similar to OMP4, namely #pragma simd and SIMD-enabled functions (formerly called elemental functions). But, the Cilk keywords is almost completely orthogonal to OpenMP. They are semantically different  and one cannot be transformed to another. Cilk uses automatically load-balanced work-stealing using the Cilk runtime, whereas OMP uses work sharing via OMP runtime. There are a number of other semantic differences but this is the core-issue. #pragma simd and #pragma omp have converged in several places but the Cilk part has always been different from OpenMP.

Yes, Cilk for loops will use the Cilk runtime and OMP for loops will use 
the OMP runtime, but that doesn't mean they can't share a lot of the 
middle end code along the way.

We already have several different varieties of parallel/simd loops all 
represented by GIMPLE_OMP_FOR, and I think this could be another 
GP_OMP_FOR_KIND_.

...
> As you can tell, this is not how openmp handles a #pragma omp for loop.

It's different in detail, but #pragma omp parallel for works very 
similarly: it creates a separate function for the body of the loop and 
then passes that to GOMP_parallel along with any shared data.

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-11-28  8:31               ` Jason Merrill
@ 2013-12-03  6:30                 ` Jeff Law
  2013-12-03 13:26                   ` Iyer, Balaji V
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff Law @ 2013-12-03  6:30 UTC (permalink / raw)
  To: Jason Merrill, Iyer, Balaji V, Aldy Hernandez
  Cc: gcc-patches, rth, Jakub Jelinek

On 11/27/13 17:52, Jason Merrill wrote:
> On 11/27/2013 04:14 PM, Iyer, Balaji V wrote:
>>     I completely agree with you that there are certain parts of Cilk
>> Plus that is similar to OMP4, namely #pragma simd and SIMD-enabled
>> functions (formerly called elemental functions). But, the Cilk
>> keywords is almost completely orthogonal to OpenMP. They are
>> semantically different  and one cannot be transformed to another. Cilk
>> uses automatically load-balanced work-stealing using the Cilk runtime,
>> whereas OMP uses work sharing via OMP runtime. There are a number of
>> other semantic differences but this is the core-issue. #pragma simd
>> and #pragma omp have converged in several places but the Cilk part has
>> always been different from OpenMP.
>
> Yes, Cilk for loops will use the Cilk runtime and OMP for loops will use
> the OMP runtime, but that doesn't mean they can't share a lot of the
> middle end code along the way.
>
> We already have several different varieties of parallel/simd loops all
> represented by GIMPLE_OMP_FOR, and I think this could be another
> GP_OMP_FOR_KIND_.
Right.  It's not a question of what runtime they call back into, but 
that both share a common overall structure.

Conceptually I look at a for loop as having 4 main components.

Initializer, test condition, increment and the body.

I'd like to hope things like the syntactic & semantic analysis of the 
first three would be largely the same.  Most of the Cilk specific bits 
would be in the handling of the body -- but there may be some 
significant code sharing that can happen there too.


>
> ...
>> As you can tell, this is not how openmp handles a #pragma omp for loop.
>
> It's different in detail, but #pragma omp parallel for works very
> similarly: it creates a separate function for the body of the loop and
> then passes that to GOMP_parallel along with any shared data.
My thoughts exactly.
jeff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2013-12-03  6:30                 ` Jeff Law
@ 2013-12-03 13:26                   ` Iyer, Balaji V
  2013-12-03 13:40                     ` Jakub Jelinek
                                       ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2013-12-03 13:26 UTC (permalink / raw)
  To: Jeff Law, Jason Merrill, Aldy Hernandez; +Cc: gcc-patches, rth, Jakub Jelinek



> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Tuesday, December 3, 2013 1:30 AM
> To: Jason Merrill; Iyer, Balaji V; Aldy Hernandez
> Cc: gcc-patches@gcc.gnu.org; rth@redhat.com; Jakub Jelinek
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On 11/27/13 17:52, Jason Merrill wrote:
> > On 11/27/2013 04:14 PM, Iyer, Balaji V wrote:
> >>     I completely agree with you that there are certain parts of Cilk
> >> Plus that is similar to OMP4, namely #pragma simd and SIMD-enabled
> >> functions (formerly called elemental functions). But, the Cilk
> >> keywords is almost completely orthogonal to OpenMP. They are
> >> semantically different  and one cannot be transformed to another.
> >> Cilk uses automatically load-balanced work-stealing using the Cilk
> >> runtime, whereas OMP uses work sharing via OMP runtime. There are a
> >> number of other semantic differences but this is the core-issue.
> >> #pragma simd and #pragma omp have converged in several places but the
> >> Cilk part has always been different from OpenMP.
> >
> > Yes, Cilk for loops will use the Cilk runtime and OMP for loops will
> > use the OMP runtime, but that doesn't mean they can't share a lot of
> > the middle end code along the way.
> >
> > We already have several different varieties of parallel/simd loops all
> > represented by GIMPLE_OMP_FOR, and I think this could be another
> > GP_OMP_FOR_KIND_.
> Right.  It's not a question of what runtime they call back into, but that both
> share a common overall structure.
> 
> Conceptually I look at a for loop as having 4 main components.
> 
> Initializer, test condition, increment and the body.
> 
> I'd like to hope things like the syntatic & semantic analysis of the first three
> would be largely the same.  Most of the Cilk specific bits would be in the
> handling of the body -- but there may be some significant code sharing that
> can happen there too.
> 
> 
> >
> > ...
> >> As you can tell, this is not how openmp handles a #pragma omp for loop.
> >
> > It's different in detail, but #pragma omp parallel for works very
> > similarly: it creates a separate function for the body of the loop and
> > then passes that to GOMP_parallel along with any shared data.
> My thoughts exactly.

I understand you both now. Let me look into the OMP routines, see what they are doing, and see how I can port that to _Cilk_for. 

Thanks,

Balaji V. Iyer.

> jeff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-12-03 13:26                   ` Iyer, Balaji V
@ 2013-12-03 13:40                     ` Jakub Jelinek
  2013-12-03 14:01                       ` Iyer, Balaji V
  2013-12-03 19:44                     ` Jeff Law
  2013-12-16  0:40                     ` Iyer, Balaji V
  2 siblings, 1 reply; 42+ messages in thread
From: Jakub Jelinek @ 2013-12-03 13:40 UTC (permalink / raw)
  To: Iyer, Balaji V; +Cc: Jeff Law, Jason Merrill, Aldy Hernandez, gcc-patches, rth

On Tue, Dec 03, 2013 at 01:25:57PM +0000, Iyer, Balaji V wrote:
> > >> As you can tell, this is not how openmp handles a #pragma omp for loop.
> > >
> > > It's different in detail, but #pragma omp parallel for works very
> > > similarly: it creates a separate function for the body of the loop and
> > > then passes that to GOMP_parallel along with any shared data.
> > My thoughts exactly.
> 
> I understand you both now. Let me look into the OMP routines and see what
> it is doing and see how I can port it to _Cilk_for.

Yeah.  The work is actually multi-stage.  First, during gimplification,
the code determines which variables are used in the #pragma omp parallel
(etc., in your case _Cilk_for) region, and whether they should be shared
or privatized (and in that case in what way: normal private, firstprivate,
lastprivate, firstprivate+lastprivate, reduction, ...).  Then there is the
omplower pass (already enabled for Cilk+ due to #pragma simd), which e.g.
lowers accesses to shared variables, creates new VAR_DECLs for the
privatized vars, etc., and then the ompexp pass, which creates the outlined
body of the function and the call to the runtime library.
I have no idea what privatization behavior _Cilk_for wants.  I'd expect that
at least the IV must be privatized, otherwise it would be racy, but what
about other vars?
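
To make the sharing classes concrete, here is a hand-written, serial sketch of roughly what an outlined region looks like once those decisions have been made. It is not GCC's actual output: shared_data, outlined_region, standin_parallel and the worker_id/worker_count globals are all hypothetical stand-ins (the last two play the role of the thread-number queries a real runtime provides), and the workers run one after another.

```cpp
#include <cassert>

// Hypothetical data block the caller builds for the outlined region.
struct shared_data {
  int *sum;   // shared: passed by address, every worker sees the same object
  int scale;  // firstprivate: passed by value, each worker takes its own copy
  int n;      // loop bound
};

// Stand-ins for the runtime's "which worker am I / how many" queries.
static int worker_id;
static int worker_count;

// Outlined body: each worker derives its own slice of [0, n).
static void outlined_region(void *p) {
  shared_data *d = static_cast<shared_data *>(p);
  int scale = d->scale;  // private copy initialized from the original
  int chunk = (d->n + worker_count - 1) / worker_count;
  int begin = worker_id * chunk;
  int end = begin + chunk < d->n ? begin + chunk : d->n;

  int partial = 0;                   // private accumulator
  for (int i = begin; i < end; i++)  // the IV is private to this worker
    partial += i * scale;

  // Combine into the shared variable.  Workers run serially in this sketch;
  // truly parallel code would need an atomic update or a reduction here.
  *d->sum += partial;
}

// Hypothetical serial stand-in for handing the region to a runtime.
static void standin_parallel(void (*fn)(void *), void *data, int workers) {
  worker_count = workers;
  for (worker_id = 0; worker_id < workers; worker_id++)
    fn(data);
}

// Computes scale * (0 + 1 + ... + (n - 1)) through the outlined region.
int scaled_sum(int n, int scale) {
  int sum = 0;
  shared_data d = { &sum, scale, n };
  standin_parallel(outlined_region, &d, 4);
  return sum;
}
```

Note the contrast with the _Cilk_for helper shown earlier in the thread: here each worker works out its own bounds, whereas the Cilk runtime passes low and high into the body.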

	Jakub

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2013-12-03 13:40                     ` Jakub Jelinek
@ 2013-12-03 14:01                       ` Iyer, Balaji V
  2013-12-03 14:10                         ` Jakub Jelinek
  0 siblings, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2013-12-03 14:01 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jeff Law, Jason Merrill, Aldy Hernandez, gcc-patches, rth



> -----Original Message-----
> From: Jakub Jelinek [mailto:jakub@redhat.com]
> Sent: Tuesday, December 3, 2013 8:40 AM
> To: Iyer, Balaji V
> Cc: Jeff Law; Jason Merrill; Aldy Hernandez; gcc-patches@gcc.gnu.org;
> rth@redhat.com
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On Tue, Dec 03, 2013 at 01:25:57PM +0000, Iyer, Balaji V wrote:
> > > >> As you can tell, this is not how openmp handles a #pragma omp for
> loop.
> > > >
> > > > It's different in detail, but #pragma omp parallel for works very
> > > > similarly: it creates a separate function for the body of the loop
> > > > and then passes that to GOMP_parallel along with any shared data.
> > > My thoughts exactly.
> >
> > I understand you both now. Let me look into the OMP routines and see
> > what it is doing and see how I can port it to _Cilk_for.
> 
> Yeah.  The work is actually multi-stage, first during gimplification the code
> does determine what variables are used in the #pragma omp parallel (etc., in
> your case _Cilk_for) region, and whether they should be shared, or
> privatized (and in that case in what way, normal private, firstprivate,
> lastprivate, firstprivate+lastprivate, reduction, ...).  Then there is omplower
> pass (already enabled for Cilk+ due to #pragma simd) that e.g.
> lowers accesses to shared variables, creates new VAR_DECLs for the
> privatized vars etc. and then ompexp pass that will create the outlined body
> of the function and create call to the runtime library.
> I have no idea what privatization behavior _Cilk_for wants, I'd expect that at
> least the IV must be privatized, otherwise it would be racy, but about other
> vars?
> 

In _Cilk_for you don't need to privatize any variables. I need to pass the loop_count, the grain (if the user specifies one), the nested function and its context to a Cilk-specific function: __cilkrts_cilk_for_64 (or _32). The nested function holds the body of the _Cilk_for and takes three parameters: context, start and end. The runtime passes in start and end to tell the function which part of the loop to execute. This thread has an example: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03567.html
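
A rough sketch of that calling convention in plain C follows.  The parameter
list of fake_cilk_for_64 mirrors the one cilk_declare_looper builds in the
attached patch (body function, context pointer, count, int grain), but the
function itself is a hypothetical serial stand-in, not the Cilk runtime: it
walks the iteration space in grain-sized chunks on a single thread.  Treating
[start, end) as a half-open range of 0-based iteration numbers, and picking a
chunk of 1 when the grain is 0, are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shape of the nested function: context, then the range to execute.  */
typedef void (*cilk_for_body_fn) (void *context, uint64_t start, uint64_t end);

/* Hypothetical context: whatever the loop body needs from its parent.  */
struct loop_context
{
  const int *in;
  int *out;
};

/* Hypothetical loop body; runs iterations start .. end - 1.  */
void
loop_body (void *context, uint64_t start, uint64_t end)
{
  struct loop_context *c = (struct loop_context *) context;
  for (uint64_t i = start; i < end; i++)
    c->out[i] = 2 * c->in[i];
}

/* Serial stand-in for the runtime entry point.  A grain of 0 means "no
   user-specified grain"; this sketch then uses chunks of one iteration.  */
void
fake_cilk_for_64 (cilk_for_body_fn body, void *context, uint64_t count,
                  int grain)
{
  assert (body != NULL);
  uint64_t chunk = grain > 0 ? (uint64_t) grain : 1;
  for (uint64_t start = 0; start < count; start += chunk)
    {
      /* Clamp the last chunk so it never runs past COUNT.  */
      uint64_t end = count - start < chunk ? count : start + chunk;
      body (context, start, end);
    }
}
```

With a count of 10 and a grain of 4, the stand-in calls loop_body three times,
for the ranges [0, 4), [4, 8) and [8, 10).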

Thanks,

Balaji V. Iyer.

> 	Jakub


* Re: [PATCH] _Cilk_for for C and C++
  2013-12-03 14:01                       ` Iyer, Balaji V
@ 2013-12-03 14:10                         ` Jakub Jelinek
  0 siblings, 0 replies; 42+ messages in thread
From: Jakub Jelinek @ 2013-12-03 14:10 UTC (permalink / raw)
  To: Iyer, Balaji V; +Cc: Jeff Law, Jason Merrill, Aldy Hernandez, gcc-patches, rth

On Tue, Dec 03, 2013 at 02:01:17PM +0000, Iyer, Balaji V wrote:
> In Cilk_for you don't need to privatize any variables. I need to pass in
> the loop_count, the grain (if the user specifies one), the nested function
> and its context to a Cilk specific function:__cilkrts_cilk_for_64 (or 32). 
> The nested function has the body of the _Cilk_for and it has 3 parameters,
> context, start and end, where the start and end are passed in by the
> runtime which will tell what parts of the loop to execute.  This thread
> has an example: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03567.html

So Cilk+ only allows say:
	_Cilk_for (int ii = 5; ii < 15; ii++)
	{
		<body>
	}
and not
	int ii;
	_Cilk_for (ii = 5; ii < 15; ii++)
	{
		<body>
	}
?  Other variables can be all shared, that is after all the default
for #pragma omp parallel for, except for the IVs and a couple of other
exceptions (e.g. readonly vars etc.), if somebody wants private vars,
they can surely be declared somewhere inside the <body>.

	Jakub


* Re: [PATCH] _Cilk_for for C and C++
  2013-12-03 13:26                   ` Iyer, Balaji V
  2013-12-03 13:40                     ` Jakub Jelinek
@ 2013-12-03 19:44                     ` Jeff Law
  2013-12-16  0:40                     ` Iyer, Balaji V
  2 siblings, 0 replies; 42+ messages in thread
From: Jeff Law @ 2013-12-03 19:44 UTC (permalink / raw)
  To: Iyer, Balaji V, Jason Merrill, Aldy Hernandez
  Cc: gcc-patches, rth, Jakub Jelinek

On 12/03/13 06:25, Iyer, Balaji V wrote:

>
> I understand you both now. Let me look into the OMP routines and see what it is doing and see how I can port it to _Cilk_for.
Thanks.  I know it's a bit of a pain, but part of what's driving the desire
to share is to reduce the long-term maintenance cost for everyone.

jeff


* RE: [PATCH] _Cilk_for for C and C++
  2013-12-03 13:26                   ` Iyer, Balaji V
  2013-12-03 13:40                     ` Jakub Jelinek
  2013-12-03 19:44                     ` Jeff Law
@ 2013-12-16  0:40                     ` Iyer, Balaji V
  2013-12-16 21:21                       ` Jason Merrill
  2 siblings, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2013-12-16  0:40 UTC (permalink / raw)
  To: 'Jeff Law', 'Jason Merrill', 'Aldy Hernandez'
  Cc: 'gcc-patches@gcc.gnu.org', 'rth@redhat.com',
	'Jakub Jelinek'

[-- Attachment #1: Type: text/plain, Size: 7773 bytes --]



> -----Original Message-----
> From: Iyer, Balaji V
> Sent: Tuesday, December 3, 2013 8:26 AM
> To: Jeff Law; Jason Merrill; Aldy Hernandez
> Cc: gcc-patches@gcc.gnu.org; rth@redhat.com; Jakub Jelinek
> Subject: RE: [PATCH] _Cilk_for for C and C++
> 
> 
> 
> > -----Original Message-----
> > From: Jeff Law [mailto:law@redhat.com]
> > Sent: Tuesday, December 3, 2013 1:30 AM
> > To: Jason Merrill; Iyer, Balaji V; Aldy Hernandez
> > Cc: gcc-patches@gcc.gnu.org; rth@redhat.com; Jakub Jelinek
> > Subject: Re: [PATCH] _Cilk_for for C and C++
> >
> > On 11/27/13 17:52, Jason Merrill wrote:
> > > On 11/27/2013 04:14 PM, Iyer, Balaji V wrote:
> > >>     I completely agree with you that there are certain parts of
> > >> Cilk Plus that are similar to OMP4, namely #pragma simd and
> > >> SIMD-enabled functions (formerly called elemental functions). But
> > >> the Cilk keywords are almost completely orthogonal to OpenMP. They
> > >> are semantically different and one cannot be transformed into the other.
> > >> Cilk uses automatically load-balanced work-stealing using the Cilk
> > >> runtime, whereas OMP uses work sharing via OMP runtime. There are a
> > >> number of other semantic differences but this is the core-issue.
> > >> #pragma simd and #pragma omp have converged in several places but
> > >> the Cilk part has always been different from OpenMP.
> > >
> > > Yes, Cilk for loops will use the Cilk runtime and OMP for loops will
> > > use the OMP runtime, but that doesn't mean they can't share a lot of
> > > the middle end code along the way.
> > >
> > > We already have several different varieties of parallel/simd loops
> > > all represented by GIMPLE_OMP_FOR, and I think this could be another
> > > GF_OMP_FOR_KIND_.
> > Right.  It's not a question of what runtime they call back into, but
> > that both share a common overall structure.
> >
> > Conceptually I look at a for loop as having 4 main components.
> >
> > Initializer, test condition, increment and the body.
> >
> > I'd like to hope things like the syntactic & semantic analysis of the
> > first three would be largely the same.  Most of the Cilk specific bits
> > would be in the handling of the body -- but there may be some
> > significant code sharing that can happen there too.
> >
> >
> > >
> > > ...
> > >> As you can tell, this is not how openmp handles a #pragma omp for
> loop.
> > >
> > > It's different in detail, but #pragma omp parallel for works very
> > > similarly: it creates a separate function for the body of the loop
> > > and then passes that to GOMP_parallel along with any shared data.
> > My thoughts exactly.
> 
> I understand you both now. Let me look into the OMP routines and see what
> it is doing and see how I can port it to _Cilk_for.
> 

Hello Jeff and Jason,
	Attached, please find a patch that implements _Cilk_for using the OMP routines. Jason, this patch only handles C, but it contains a lot of middle-end changes in omp-low.c (the C-only front-end changes are minimal) that will also affect the C++ patch.

	It passes all the tests on my x86_64 SUSE box in both -m64 and -m32 modes. Is this OK for trunk?

	The C++ patch is coming up shortly.

Here are the ChangeLog entries:

gcc/ChangeLog
2013-12-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

        * cilk-builtins.def: Added two new builtin functions called
        __cilkrts_cilk_for_32 and __cilkrts_cilk_for_64.
        * cilk-common.c (cilk_init_builtins): Likewise.
        (cilk_declare_looper): New function.
        * cilk.h (enum cilk_tree_index): Added two new fields called
        CILK_TI_F_LOOP_32 and CILK_TI_F_LOOP_64.
        (cilk_for_32_fndecl): New #define.
        (cilk_for_64_fndecl): Likewise.
        * gimple-pretty-print.c (dump_gimple_omp_for): Added a new case for
        GF_OMP_FOR_KIND_CILKFOR.  Also emit "_Cilk_for (" instead of "for ("
        when the gimple kind is GF_OMP_FOR_KIND_CILKFOR.
        * gimple.h (enum gf_mask): Added a new field GF_OMP_FOR_KIND_CILKFOR.
        Re-arranged a couple of other fields to put them all in ascending order.
        (struct gimple_omp_for_iter): Added 2 new fields: loop_count, grain.
        (gimple_statement_omp_parallel_layout): Likewise.
        (gimple_omp_for_combined_p): Added a check for combined instead of
        a logical and.
        (gimple_cilk_for_set_count): New function.
        (gimple_cilk_for_set_grain): Likewise.
        (set_cilk_for_parallel_loop_count): Likewise.
        (set_cilk_for_parallel_grain): Likewise.
        (gimple_cilk_for_parallel_get_loop_count): Likewise.
        (gimple_cilk_for_parallel_get_grain): Likewise.
        (gimple_cilk_for_induction_var): Likewise.
        (gimple_cilk_for_loop_count): Likewise.
        (gimple_cilk_for_grain): Likewise.
        * gimplify.c (cilk_for_compute_set_count_grain): Likewise.
        (gimplify_omp_for): Added code to handle gimplification of a _Cilk_for
        statement.
        * omp-low.c (struct cilk_for_information): New structure.
        (create_omp_child_function_name): Added a new bool parameter called
        is_cilk_for.  If this is set, then use a different suffix.
        (find_cilk_for_stmt): New function.
        (is_cilk_for_stmt): Likewise.
        (cilk_for_check_loop_diff_type): Likewise.
        (expand_cilk_for_call): Likewise.
        (expand_cilk_for): Likewise.
        (create_omp_child_function): Added a check to see whether the statement is a
        _Cilk_for.  If so, then create a child function with different
        number of parameters.
        (expand_omp_taskreg): Added code to extract the high and low parameters
        from the child function and then insert it in the appropriate location.
        Added a call to expand_cilk_for_call to insert _Cilk_for's builtin
        function call.
        (expand_omp_for): Added a check for GF_OMP_FOR_KIND_CILKFOR for the
        for statement's kind.  If so then call expand_cilk_for.
        * tree-core.h (enum omp_clause_schedule_kind): Added a new field
        OMP_CLAUSE_SCHEDULE_CILK_FOR.
        * tree.def (CILK_FOR): New tree.

gcc/c-family/ChangeLog
2013-12-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

        * c-omp.c (c_finish_omp_for): Added a check for CILK_FOR along with
        CILK_SIMD.
        * c-common.h (enum rid): Added new value called "RID_CILK_FOR".
        * c-common.c (c_common_reswords[]): Added a new field "_Cilk_for".
        * c-pragma.c (init_pragma): Added cilk grainsize pragma.
        * c-pragma.h (enum pragma_kind): Added PRAGMA_CILK_GRAINSIZE.

gcc/c/ChangeLog
2013-12-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

        * c-parser.c (c_parser_statement_after_labels): Added a RID_CILK_FOR
        case.
        (c_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
        (c_parser_omp_for_loop): Renamed the "clauses" parameter to
        "clauses_or_grain".  Added handling for _Cilk_for statements.  Set
        the grain value to the clauses location.
        (c_parser_cilk_grainsize): New function.
        (c_parser_cilk_simd): Added a new parameter called grain.  Also added
        support to parse _Cilk_for statements.

gcc/testsuite/ChangeLog
2013-12-15  Balaji V. Iyer  <balaji.v.iyer@intel.com>

        * c-c++-common/cilk-plus/CK/cilk-fors.c: New testcase.
        * c-c++-common/cilk-plus/CK/nested_cilk_for.c: Likewise.
        * c-c++-common/cilk-plus/CK/cilk_for_grain.c: Likewise.
        * c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
        * c-c++-common/cilk-plus/CK/cilk_for_errors.c: Likewise.
        * c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.
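
For reference, the loop-count arithmetic that cilk_for_compute_set_count_grain
(gimplify.c entry above) performs on trees can be sketched on plain integers
as below.  The enum and function names are hypothetical and exist only for
this illustration; STEP is the absolute value of the increment, as in the
patch.  The sketch only mirrors the patch's arithmetic and has been checked
here for ranges that divide evenly by the step.

```c
#include <assert.h>

/* Hypothetical stand-ins for the LT_EXPR, LE_EXPR, GT_EXPR and GE_EXPR
   comparison codes.  */
enum loop_cond { COND_LT, COND_LE, COND_GT, COND_GE };

/* Mirror of the arithmetic in the patch: (FINAL - INIT) / STEP, negated
   first for downward loops, plus one when the bound is inclusive.  */
long
cilk_for_loop_count (long init, long final, long step, enum loop_cond cond)
{
  assert (step > 0);
  long m = final - init;
  if (cond == COND_GT || cond == COND_GE)
    m = -m;                     /* downward loop: flip the sign */
  long t = m / step;            /* truncating division, as TRUNC_DIV_EXPR */
  if (cond == COND_LE || cond == COND_GE)
    t += 1;                     /* inclusive bound adds one iteration */
  return t;
}
```

For example, the loop "_Cilk_for (ii = 5; ii < 15; ii++)" corresponds to
cilk_for_loop_count (5, 15, 1, COND_LT).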

Thanks,

Balaji V. Iyer.



[-- Attachment #2: diff.txt --]
[-- Type: text/plain, Size: 54840 bytes --]

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index cfaeaf0..c3dcb21
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -414,6 +414,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 4357d1f..508de30
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 3ccf8f9..6ae0a0f
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -386,7 +386,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
   bool fail = false;
   int i;
 
-  if (code == CILK_SIMD
+  if ((code == CILK_SIMD || code == CILK_FOR) 
       && !c_check_cilk_loop (locus, TREE_VEC_ELT (declv, 0)))
     fail = true;
 
@@ -516,7 +516,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 					    0))
 		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
 					 ? LT_EXPR : GE_EXPR);
-		  else if (code != CILK_SIMD)
+		  else if (code != CILK_SIMD && code != CILK_FOR)
 		    cond_ok = false;
 		}
 	    }
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index 64a5b66..9d6efc5
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1394,6 +1394,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 5379b9e..ef62653
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index c78d269..a02fd4f
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1242,9 +1242,10 @@ static bool c_parser_objc_diagnose_bad_element_prefix
   (c_parser *, struct c_declspecs *);
 
 /* Cilk Plus supporting routines.  */
-static void c_parser_cilk_simd (c_parser *);
+static void c_parser_cilk_simd (c_parser *, bool, tree);
 static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context);
 static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
+static void c_parser_cilk_grainsize (c_parser *);
 
 /* Parse a translation unit (C90 6.7, C99 6.9).
 
@@ -4776,6 +4777,16 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (c_parser_peek_token (parser)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      c_parser_skip_to_end_of_block_or_statement (parser);
+	    }
+	  else
+	    c_parser_cilk_simd (parser, true, integer_zero_node);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9386,7 +9397,24 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       if (!c_parser_cilk_verify_simd (parser, context))
 	return false;
       c_parser_consume_pragma (parser);
-      c_parser_cilk_simd (parser);
+      c_parser_cilk_simd (parser, false, NULL_TREE);
+      return false;
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not"
+		   " enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
       return false;
 
     default:
@@ -11460,7 +11488,7 @@ c_parser_omp_flush (c_parser *parser)
 
 static tree
 c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
-		       tree clauses, tree *cclauses)
+		       tree clauses_or_grain, tree *cclauses)
 {
   tree decl, cond, incr, save_break, save_cont, body, init, stmt, cl;
   tree declv, condv, incrv, initv, ret = NULL;
@@ -11468,6 +11496,8 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   int i, collapse = 1, nbraces = 0;
   location_t for_loc;
   vec<tree, va_gc> *for_block = make_tree_vector ();
+  tree clauses = code == CILK_FOR ? NULL_TREE : clauses_or_grain;
+  tree grain = code == CILK_FOR ? clauses_or_grain : NULL_TREE;
 
   for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl))
     if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE)
@@ -11480,11 +11510,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   condv = make_tree_vec (collapse);
   incrv = make_tree_vec (collapse);
 
-  if (!c_parser_next_token_is_keyword (parser, RID_FOR))
+  if (code != CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_FOR))
     {
       c_parser_error (parser, "for statement expected");
       return NULL;
     }
+  if (code == CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_CILK_FOR))
+    {
+      c_parser_error (parser, "_Cilk_for statement expected");
+      return NULL;
+    }
   for_loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
 
@@ -11736,6 +11773,11 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 		  }
 	    }
 	  OMP_FOR_CLAUSES (stmt) = clauses;
+	  /* If it is a _Cilk_for statement, then the OMP_FOR_CLAUSES location
+	     stores the user-defined grain value or an integer_zero_node 
+	     indicating that the runtime must compute a suitable grain.  */
+	  if (code == CILK_FOR)
+	    OMP_FOR_CLAUSES (stmt) = grain;
 	}
       ret = stmt;
     }
@@ -13566,18 +13608,75 @@ c_parser_cilk_all_clauses (c_parser *parser)
   return c_finish_cilk_clauses (clauses);
 }
 
+/* This function helps parse the grainsize pragma for a _Cilk_for statement. 
+   Here is the correct syntax of this pragma: 
+	    #pragma cilk grainsize = <EXP> 
+ */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+
+  /* consume the 'grainsize' keyword.  */
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>") != 0)
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR)
+	{
+	  error_at (input_location, "cannot convert grain to long integer.\n");
+	  c_parser_skip_to_pragma_eol (parser);
+	}   
+      else if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token && token->type == CPP_KEYWORD
+	      && token->keyword == RID_CILK_FOR)
+	    {
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain && grain != error_mark_node) 
+		c_parser_cilk_simd (parser, true, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
    loops.  */
 
 static void
-c_parser_cilk_simd (c_parser *parser)
+c_parser_cilk_simd (c_parser *parser, bool is_cilk_for, tree grain)
 {
-  tree clauses = c_parser_cilk_all_clauses (parser);
+  tree super_block = NULL_TREE;
+  tree clauses = NULL_TREE;
+  
+  if (!is_cilk_for)
+    clauses = c_parser_cilk_all_clauses (parser);
+  else
+    { 
+      super_block = c_begin_omp_parallel ();
+      clauses = grain;
+    }
   tree block = c_begin_compound_stmt (true);
   location_t loc = c_parser_peek_token (parser)->location;
-  c_parser_omp_for_loop (loc, parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  c_parser_omp_for_loop (loc, parser, code, clauses, NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
+  if (is_cilk_for) 
+    /* The term super_block is not used in scheduling terms but in 
+       set-theory, i.e. set vs. super-set.  */ 
+    c_finish_omp_parallel (loc, NULL_TREE, super_block);
 }
 \f
 /* Parse a transaction attribute (GCC Extension).
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
index 8634194..7f8f97a
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
index 52b3785..2574f12
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -106,6 +106,26 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME.  */
+
+static tree
+cilk_declare_looper (const char *name, tree type, enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -269,6 +289,15 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_32",
+                                            unsigned_intSI_type_node,
+                                            BUILT_IN_CILK_FOR_32);
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_64",
+                                            unsigned_intDI_type_node,
+                                            BUILT_IN_CILK_FOR_64);
+
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
index e990992..ea9f6ff
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,9 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
+
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -77,6 +80,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 42e3f5f..867aa52 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1158,6 +1158,8 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_DISTRIBUTE:
 	  pp_string (buffer, "#pragma omp distribute");
 	  break;
+	case GF_OMP_FOR_KIND_CILKFOR:
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -1167,7 +1169,10 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	  if (i)
 	    spc += 2;
 	  newline_and_indent (buffer, spc);
-	  pp_string (buffer, "for (");
+	  if (gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
+	    pp_string (buffer, "_Cilk_for (");
+	  else
+	    pp_string (buffer, "for (");
 	  dump_generic_node (buffer, gimple_omp_for_index (gs, i), spc,
 			     flags, false);
 	  pp_string (buffer, " = ");
diff --git a/gcc/gimple.h b/gcc/gimple.h
index a49016f..88f08d3
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -91,12 +91,13 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
-    GF_OMP_FOR_KIND_MASK	= 3 << 0,
+    GF_OMP_FOR_KIND_MASK	= 7 << 0,
     GF_OMP_FOR_KIND_FOR		= 0 << 0,
     GF_OMP_FOR_KIND_DISTRIBUTE	= 1 << 0,
     GF_OMP_FOR_KIND_SIMD	= 2 << 0,
     GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
     GF_OMP_FOR_COMBINED		= 1 << 2,
+    GF_OMP_FOR_KIND_CILKFOR     = 5 << 0,
     GF_OMP_FOR_COMBINED_INTO	= 1 << 3,
     GF_OMP_TARGET_KIND_MASK	= 3 << 0,
     GF_OMP_TARGET_KIND_REGION	= 0 << 0,
@@ -523,6 +524,12 @@ struct GTY(()) gimple_omp_for_iter {
 
   /* Increment.  */
   tree incr;
+
+  /* Loop count, only used by _Cilk_for.  */
+  tree loop_count;
+
+  /* Grain value, only used by _Cilk_for.  */
+  tree grain;
 };
 
 /* GIMPLE_OMP_FOR */
@@ -4298,7 +4305,7 @@ static inline bool
 gimple_omp_for_combined_p (const_gimple g)
 {
   GIMPLE_CHECK (g, GIMPLE_OMP_FOR);
-  return (gimple_omp_subcode (g) & GF_OMP_FOR_COMBINED) != 0;
+  return (gimple_omp_for_kind (g) == GF_OMP_FOR_COMBINED);
 }
 
 
@@ -4562,6 +4569,58 @@ gimple_omp_for_set_pre_body (gimple gs, gimple_seq pre_body)
   omp_for_stmt->pre_body = pre_body;
 }
 
+/* Set COUNT to be the loop count value for OMP_FOR GS.  */
+
+static inline void
+gimple_cilk_for_set_count (tree count, gimple gs)
+{
+  const gimple_statement_omp_for *omp_for_stmt =
+    as_a <gimple_statement_omp_for> (gs);
+  omp_for_stmt->iter[0].loop_count = count;
+}
+
+/* Set GRAIN to be the grain value used by Cilk runtime for OMP_FOR GS.  */
+
+static inline void
+gimple_cilk_for_set_grain (tree grain, gimple gs)
+{
+  const gimple_statement_omp_for *omp_for_stmt =
+    as_a <gimple_statement_omp_for> (gs);
+  omp_for_stmt->iter[0].grain = grain;
+}
+
+/* Returns the induction variable of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_induction_var (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->index;
+}
+
+/* Returns the LOOP_COUNT value of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_loop_count (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->loop_count;
+}
+
+/* Returns the GRAIN value of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_grain (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->grain;
+}
 
 /* Return the clauses associated with OMP_PARALLEL GS.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 1ca847a..bcc5ede5 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6545,6 +6545,26 @@ find_combined_omp_for (tree *tp, int *walk_subtrees, void *)
   return NULL_TREE;
 }
 
+/* Computes the loop count, absolute value of  (FINAL-INIT)/STEP and store
+   them in CFOR->ITER->LOOP_COUNT.  GRAIN is stored in CFOR->ITER->GRAIN.  */
+
+static void
+cilk_for_compute_set_count_grain (gimple cfor, tree init, tree final, 
+				  tree incr, tree grain)
+{
+  enum tree_code cond = gimple_omp_for_cond (cfor, 0);
+  tree type = TREE_TYPE (init);
+  tree m = fold_build2 (MINUS_EXPR, type, final, init);
+  if (cond == GT_EXPR || cond == GE_EXPR)
+    m = fold_build1 (NEGATE_EXPR, TREE_TYPE (m), m);
+
+  tree t = fold_build2 (TRUNC_DIV_EXPR, type, m, incr);
+  if (cond == LE_EXPR || cond == GE_EXPR)
+    t = fold_build2 (PLUS_EXPR, type, t, build_one_cst (type));
+  gimple_cilk_for_set_count (t, cfor);
+  gimple_cilk_for_set_grain (grain, cfor);
+}
+
 /* Gimplify the gross structure of an OMP_FOR statement.  */
 
 static enum gimplify_status
@@ -6559,7 +6579,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
   bool simd;
   bitmap has_decl_expr = NULL;
 
+  tree grain = NULL_TREE;
+  tree orig_init = NULL_TREE, orig_cond = NULL_TREE, orig_incr = NULL_TREE;
   orig_for_stmt = for_stmt = *expr_p;
+  
+  if (TREE_CODE (for_stmt) == CILK_FOR) 
+    { 
+      grain = OMP_FOR_CLAUSES (for_stmt);
+      OMP_FOR_CLAUSES (for_stmt) = NULL_TREE;
+    }
 
   simd = TREE_CODE (for_stmt) == OMP_SIMD
     || TREE_CODE (for_stmt) == CILK_SIMD;
@@ -6677,7 +6705,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	}
       else
 	var = decl;
-
+  
+      /* Original initial, final and increment values are necessary to compute
+	 the loop-count.  Otherwise, they are stored in variables and their
+	 context could be changed, potentially making it impossible to compute
+	 them correctly.  */
+      orig_init = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6689,6 +6722,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gcc_assert (COMPARISON_CLASS_P (t));
       gcc_assert (TREE_OPERAND (t, 0) == decl);
 
+      orig_cond = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6713,6 +6747,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	    t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	    t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	    TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	    orig_incr = build_one_cst (TREE_TYPE (t));
 	    break;
 	  }
 
@@ -6726,6 +6761,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  orig_incr = build_one_cst (TREE_TYPE (t));
 	  break;
 
 	case MODIFY_EXPR:
@@ -6753,8 +6789,16 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	      gcc_unreachable ();
 	    }
 
-	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-				is_gimple_val, fb_rvalue);
+	  orig_incr = TREE_OPERAND (t, 1);
+	  /* Right here we are just trying to extract the absolute
+	     value of the increment.  */
+	  if (TREE_CODE (t) == MINUS_EXPR
+	      || TREE_CODE  (TREE_OPERAND (t, 1)) == NEGATE_EXPR
+	      || (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST
+		  && tree_int_cst_sgn (TREE_OPERAND (t, 1)) < 1))
+	    orig_incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), orig_incr);
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), pre_p,
+				NULL, is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
 	  if (c)
 	    {
@@ -6825,6 +6869,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
     case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
     case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break;
+    case CILK_FOR: kind = GF_OMP_FOR_KIND_CILKFOR; break;
     case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break;
     default:
       gcc_unreachable ();
@@ -6859,6 +6904,10 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gimple_omp_for_set_incr (gfor, i, TREE_OPERAND (t, 1));
     }
 
+  if (kind == GF_OMP_FOR_KIND_CILKFOR)
+    cilk_for_compute_set_count_grain (gfor, orig_init, orig_cond, orig_incr, 
+				      grain);
+
   gimplify_seq_add_stmt (pre_p, gfor);
   if (ret != GS_ALL_DONE)
     return GS_ERROR;
@@ -7880,6 +7929,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OMP_FOR:
 	case OMP_SIMD:
 	case CILK_SIMD:
+	case CILK_FOR:
 	case OMP_DISTRIBUTE:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 05fca40..142621d
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "tree-nested.h"
 #include "tree-eh.h"
+#include "cilk.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -198,6 +199,13 @@ struct omp_for_data
   struct omp_for_data_loop *loops;
 };
 
+/* A structure with the necessary elements from a _Cilk_for statement.  This
+   structure is passed in through WALK_STMT_INFO->INFO.  */
+typedef struct cilk_for_information {
+  bool found;
+  tree induction_var;
+  tree count;
+} cilk_for_info;
 
 static splay_tree all_contexts;
 static int taskreg_nesting_level;
@@ -314,6 +322,8 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   fd->have_ordered = false;
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
   fd->chunk_size = NULL_TREE;
+  if (gimple_omp_for_kind (fd->for_stmt) ==  GF_OMP_FOR_KIND_CILKFOR)
+    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
   collapse_iter = NULL;
   collapse_count = NULL;
 
@@ -1820,29 +1830,125 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	scan_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 }
 
-/* Create a new name for omp child function.  Returns an identifier.  */
+/* Create a new name for omp child function.  Returns an identifier.  If
+   IS_CILK_FOR is true then the suffix for the child function is
+   "_cilk_for_fn".  */
 
 static GTY(()) unsigned int tmp_ompfn_id_num;
 
 static tree
-create_omp_child_function_name (bool task_copy)
+create_omp_child_function_name (bool task_copy, bool is_cilk_for)
 {
+  if (is_cilk_for)
+    return clone_function_name (current_function_decl, "_cilk_for_fn");
   return (clone_function_name (current_function_decl,
 			       task_copy ? "_omp_cpyfn" : "_omp_fn"));
 }
 
+/* Helper function for walk_gimple_seq.  *GSI_P is the gimple statement
+   iterator passed by walk_gimple_seq, and WI->INFO holds the CILK_FOR_INFO
+   structure.  This function sets the values inside this structure if it
+   finds a _Cilk_for statement in *GSI_P.  HANDLED_OPS_P is unused.  */
+
+static tree
+find_cilk_for_stmt (gimple_stmt_iterator *gsi_p,
+		    bool *handled_ops_p ATTRIBUTE_UNUSED,
+		    struct walk_stmt_info *wi)
+{
+  cilk_for_info *cf_info = (cilk_for_info *) wi->info;
+  gimple stmt = gsi_stmt (*gsi_p);
+
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+      && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_CILKFOR)
+      /* For nested _Cilk_for statements, just look into the
+	 outer-most one.  */
+      && cf_info->found == false)
+    {
+      cf_info->found = true;
+      cf_info->induction_var = gimple_cilk_for_induction_var (stmt);
+      cf_info->count = gimple_cilk_for_loop_count (stmt);
+    }
+  return NULL_TREE;
+}
+
+/* Returns true if STMT contains a CILK_FOR statement.  If found then
+   populate *IND_VAR and *LOOP_COUNT with induction variable
+   and loop-count value.  Otherwise these values remain untouched.  
+   IND_VAR and LOOP_COUNT can be NULL and if so then they are also 
+   left untouched.  */
+
+static bool
+is_cilk_for_stmt (gimple stmt, tree *ind_var, tree *loop_count)
+{
+  if (!flag_enable_cilkplus)
+    return false;
+  if (gimple_code (stmt) == GIMPLE_OMP_PARALLEL)
+    stmt = gimple_omp_body (stmt);
+  if (gimple_code (stmt) == GIMPLE_BIND)
+    {
+      gimple_seq body = gimple_bind_body (stmt);
+      struct walk_stmt_info wi;
+      cilk_for_info cf_info;
+      memset (&cf_info, 0, sizeof (cilk_for_info));
+      memset (&wi, 0, sizeof (wi));
+      wi.info = &cf_info;
+      walk_gimple_seq (body, find_cilk_for_stmt, NULL, &wi);
+      if (cf_info.found)
+	{
+	  if (ind_var)
+	    *ind_var = cf_info.induction_var;
+	  if (loop_count)
+	    *loop_count = cf_info.count;
+	  return true;
+	}
+    }
+  return false;
+}
+
+/* Returns the type of the induction variable for the child function for
+   _Cilk_for and the types for _high and _low variables based on TYPE.  */
+
+static tree
+cilk_for_check_loop_diff_type (tree type)
+{
+  if (type == integer_type_node)
+    return type;
+  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
+    { 
+      if (TYPE_UNSIGNED (type)) 
+	return uint32_type_node;
+      else
+	return integer_type_node;
+    }
+  else
+    {
+      if (TYPE_UNSIGNED (type)) 
+	return uint64_type_node;
+      else
+	return long_long_integer_type_node;
+    }
+  gcc_unreachable ();
+}
+
 /* Build a decl for the omp child function.  It'll not contain a body
    yet, just the bare decl.  */
 
 static void
 create_omp_child_function (omp_context *ctx, bool task_copy)
 {
-  tree decl, type, name, t;
+  tree decl, type, name, t, ind_var = NULL_TREE, loop_count = NULL_TREE;
 
-  name = create_omp_child_function_name (task_copy);
+  bool is_cilk_for = is_cilk_for_stmt (ctx->stmt, &ind_var, &loop_count);
+  tree cilk_var_type = (is_cilk_for ?
+    cilk_for_check_loop_diff_type (TREE_TYPE (ind_var)) : NULL_TREE);
+  
+  name = create_omp_child_function_name (task_copy, is_cilk_for);
   if (task_copy)
     type = build_function_type_list (void_type_node, ptr_type_node,
 				     ptr_type_node, NULL_TREE);
+  else if (is_cilk_for)
+    type = build_function_type_list (void_type_node, ptr_type_node,
+				     cilk_var_type, cilk_var_type, NULL_TREE);
   else
     type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
 
@@ -1892,13 +1998,46 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_CONTEXT (t) = decl;
   DECL_RESULT (decl) = t;
 
-  t = build_decl (DECL_SOURCE_LOCATION (decl),
-		  PARM_DECL, get_identifier (".omp_data_i"), ptr_type_node);
+  /* _Cilk_for's child function requires two extra parameters called
+     __low and __high that are set by the Cilk runtime when it calls this
+     function.  */
+  if (is_cilk_for)
+    {
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__high"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__low"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+    }
+
+  tree data_name = get_identifier (".omp_data_i");
+  if (is_cilk_for)
+    data_name = get_identifier (".cilk_for_data_i");
+  t = build_decl (DECL_SOURCE_LOCATION (decl), PARM_DECL, data_name,
+		  ptr_type_node);
   DECL_ARTIFICIAL (t) = 1;
   DECL_NAMELESS (t) = 1;
   DECL_ARG_TYPE (t) = ptr_type_node;
   DECL_CONTEXT (t) = current_function_decl;
   TREE_USED (t) = 1;
+  if (is_cilk_for)
+    DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
   DECL_ARGUMENTS (decl) = t;
   if (!task_copy)
     ctx->receiver_decl = t;
@@ -4311,6 +4450,41 @@ expand_parallel_call (struct omp_region *region, basic_block bb,
 			    false, GSI_CONTINUE_LINKING);
 }
 
+/* Builds a function call using the values from WS_ARGS and data arguments 
+   of ENTRY_STMT.  The function call is inserted into BB.  */
+
+static void
+expand_cilk_for_call (basic_block bb, gimple entry_stmt,
+		      vec <tree, va_gc> *ws_args)
+{
+  tree t, t1, t2;
+  gimple_stmt_iterator gsi;
+  vec <tree, va_gc> *args;
+
+  /* The builtin function's name, the loop-count and the grain value are
+     stored in WS_ARGS.  */
+  tree func_name = (*ws_args)[0];
+  tree count = (*ws_args)[1];
+  tree grain = (*ws_args)[2];
+
+  gsi = gsi_last_bb (bb);
+  t = gimple_omp_parallel_data_arg (entry_stmt);
+  if (t == NULL)
+    t1 = null_pointer_node;
+  else
+    t1 = build_fold_addr_expr (t);
+  t2 = build_fold_addr_expr (gimple_omp_parallel_child_fn (entry_stmt));
+  
+  vec_alloc (args, 4);
+  args->quick_push (t2);
+  args->quick_push (t1);
+  args->quick_push (count);
+  args->quick_push (grain);
+  t = build_call_expr_loc_vec (UNKNOWN_LOCATION, func_name, args);
+
+  force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false, 
+			    GSI_CONTINUE_LINKING);
+}
 
 /* Build the function call to GOMP_task to actually
    generate the task operation.  BB is the block where to insert the code.  */
@@ -4646,8 +4820,39 @@ expand_omp_taskreg (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  /* The way _Cilk_for is constructed in this compiler can be thought of
+     as a parallel omp_for.  But the inner workings between them are very
+     different so we need a way to differentiate between them.  Thus, we
+     added a new schedule type called OMP_CLAUSE_SCHEDULE_CILKFOR, which 
+     pretty much says that this is not a parallel omp for but a _Cilk_for
+     statement.  */
+  bool is_cilk_for =
+    (flag_enable_cilkplus && region->inner &&
+     (region->inner->sched_kind == OMP_CLAUSE_SCHEDULE_CILKFOR));
+
+  /* Extract the __high and __low parameters from the function.  */
+  tree high_arg = NULL_TREE, low_arg = NULL_TREE;
+  if (is_cilk_for)
+    {
+      for (tree ii_arg = DECL_ARGUMENTS (child_fn); ii_arg != NULL_TREE;
+	   ii_arg = TREE_CHAIN (ii_arg))
+	{
+	  if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__high"))
+	    high_arg = ii_arg;
+	  if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__low"))
+	    low_arg = ii_arg;
+	}
+      gcc_assert (high_arg);
+      gcc_assert (low_arg);
+    }
+  
   if (is_combined_parallel (region))
     ws_args = region->ws_args;
+  else if (is_cilk_for) 
+    /* If it is a _Cilk_for statement, it is modelled *like* a parallel for,
+       so the inner statement should have the information that is required
+       by expand_cilk_for_call.  */
+    ws_args = region->inner->ws_args;
   else
     ws_args = NULL;
 
@@ -4753,6 +4958,49 @@ expand_omp_taskreg (struct omp_region *region)
 	    }
 	}
 
+      /* In here the calls to the GET_NUM_THREADS and GET_THREAD_NUM are
+	 removed.  Further, they will be replaced by __low and __high
+	 parameter values.  */
+      gimple high_assign = NULL, low_assign = NULL;
+      if (is_cilk_for)
+	{
+	  gimple_stmt_iterator gsi2 = gsi_start_bb (single_succ (entry_bb));
+	  while (!gsi_end_p (gsi2))
+	    {
+	      gimple stmt = gsi_stmt (gsi2);
+	
+	      if (gimple_call_builtin_p (stmt, BUILT_IN_OMP_GET_NUM_THREADS))
+		{
+		  /* There can only be one call to each of these two
+		     functions.  If there are multiple, then something went
+		     wrong somewhere.  */
+		  gcc_assert (low_assign == NULL);
+		  tree ltype = TREE_TYPE (gimple_get_lhs (stmt));
+		  tree tmp2 = create_tmp_reg (TREE_TYPE (low_arg), NULL);
+		  low_assign = gimple_build_assign 
+		    (gimple_get_lhs (stmt), fold_convert (ltype, tmp2));
+		  gsi_remove (&gsi2, true);
+		  gimple tmp_stmt = gimple_build_assign (tmp2, low_arg);
+		  gsi_insert_before (&gsi2, low_assign, GSI_NEW_STMT);
+		  gsi_insert_before (&gsi2, tmp_stmt, GSI_NEW_STMT);
+		}
+	      else if (gimple_call_builtin_p (stmt,
+					      BUILT_IN_OMP_GET_THREAD_NUM))
+		{
+		  gcc_assert (high_assign == NULL);
+		  tree htype = TREE_TYPE (gimple_get_lhs (stmt));
+		  tree tmp2 = create_tmp_reg (TREE_TYPE (high_arg), NULL);
+		  
+		  high_assign = gimple_build_assign 
+		    (gimple_get_lhs (stmt), fold_convert (htype, tmp2));
+		  gsi_remove (&gsi2, true);
+		  gimple tmp_stmt = gimple_build_assign (tmp2, high_arg);
+		  gsi_insert_before (&gsi2, high_assign, GSI_NEW_STMT);
+		  gsi_insert_before (&gsi2, tmp_stmt, GSI_NEW_STMT);
+		}
+	      gsi_next (&gsi2);
+	    }
+	}      
       /* Declare local variables needed in CHILD_CFUN.  */
       block = DECL_INITIAL (child_fn);
       BLOCK_VARS (block) = vec2chain (child_cfun->local_decls);
@@ -4815,6 +5063,13 @@ expand_omp_taskreg (struct omp_region *region)
       if (loops_state_satisfies_p (LOOPS_NEED_FIXUP))
 	child_cfun->x_current_loops->state |= LOOPS_NEED_FIXUP;
 
+      /* We expand it before it is customarily done for other flavors because
+	 the call to the function __cilkrts_cilk_for_64/32 (inserted by the 
+	 function below) may use some variables and thus the function call 
+	 must be inserted before the unwanted variables are eliminated.  */
+      if (is_cilk_for)
+	expand_cilk_for_call (new_bb, entry_stmt, ws_args);
+
       /* Remove non-local VAR_DECLs from child_cfun->local_decls list.  */
       num = vec_safe_length (child_cfun->local_decls);
       for (srcidx = 0, dstidx = 0; srcidx < num; srcidx++)
@@ -4828,7 +5083,7 @@ expand_omp_taskreg (struct omp_region *region)
 	}
       if (dstidx != num)
 	vec_safe_truncate (child_cfun->local_decls, dstidx);
-
+ 
       /* Inform the callgraph about the new function.  */
       DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
       cgraph_add_new_function (child_fn, true);
@@ -4859,11 +5114,14 @@ expand_omp_taskreg (struct omp_region *region)
       pop_cfun ();
     }
 
-  /* Emit a library call to launch the children threads.  */
-  if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
-    expand_parallel_call (region, new_bb, entry_stmt, ws_args);
-  else
-    expand_task_call (new_bb, entry_stmt);
+  if (!is_cilk_for)
+    {
+      /* Emit a library call to launch the children threads.  */
+      if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
+	expand_parallel_call (region, new_bb, entry_stmt, ws_args);
+      else
+	expand_task_call (new_bb, entry_stmt);
+    }
   if (gimple_in_ssa_p (cfun))
     update_ssa (TODO_update_ssa_only_virtuals);
 }
@@ -6537,7 +6795,225 @@ expand_omp_for_static_chunk (struct omp_region *region,
 	}
     }
 }
+/* A subroutine of expand_omp_for.  Generate code for _Cilk_for loop.  
+   Given parameters: 
+   for (V = N1; V cond N2; V += STEP) BODY; 
+   
+   where COND is "<" or ">", we generate pseudocode
+    
+   for (ind_var = low; ind_var < high; ind_var++)
+   {  
+      if (COND is "<")
+	V = N1 + (ind_var * STEP);
+      else
+	V = N1 - (ind_var * STEP);
+
+      <BODY>
+    }  
+  
+    In the above pseudocode, low and high are function parameters of the
+    child function.  In the function below, we insert temporary variables
+    initialized by calls to two OMP functions that will not otherwise be
+    found in the body of _Cilk_for (since OMP_FOR cannot be mixed
+    with _Cilk_for).  These calls are replaced with low and high
+    by the function that handles taskreg (expand_omp_taskreg).  */
+
 
+static void
+expand_cilk_for (struct omp_region *region, struct omp_for_data *fd)
+{
+  bool broken_loop = region->cont == NULL;
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  basic_block entry_bb = region->entry;
+  basic_block cont_bb = region->cont;
+  
+  gcc_assert (EDGE_COUNT (entry_bb->succs) == 2);
+  gcc_assert (broken_loop
+	      || BRANCH_EDGE (entry_bb)->dest == FALLTHRU_EDGE (cont_bb)->dest);
+  basic_block l0_bb = FALLTHRU_EDGE (entry_bb)->dest;
+  basic_block l1_bb, l2_bb;
+
+  tree grain = gimple_cilk_for_grain (fd->for_stmt);
+  if (!broken_loop)
+    {
+      gcc_assert (BRANCH_EDGE (cont_bb)->dest == l0_bb);
+      gcc_assert (EDGE_COUNT (cont_bb->succs) == 2);
+      l1_bb = split_block (cont_bb, last_stmt (cont_bb))->dest;
+      l2_bb = BRANCH_EDGE (entry_bb)->dest;
+    }
+  else
+    {
+      BRANCH_EDGE (entry_bb)->flags &= ~EDGE_ABNORMAL;
+      l1_bb = split_edge (BRANCH_EDGE (entry_bb));
+      l2_bb = single_succ (l1_bb);
+    }
+  basic_block exit_bb = region->exit;
+  basic_block l2_dom_bb = NULL;
+
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+
+  /* Below statements until the "tree high_val = ..." are pseudo statements 
+     used to pass information to be used by expand_omp_taskreg.
+     low_val and high_val will be replaced by the __low and __high
+     parameter from the child function.
+
+     The call_exprs part is a place-holder, it is mainly used 
+     to distinctly identify to the top-level part that this is
+     where we should put low and high (reasoning given in header 
+     comment).  */
+
+  tree t = build_call_expr
+    (builtin_decl_explicit (BUILT_IN_OMP_GET_NUM_THREADS), 0);
+  t = fold_convert (type, t);
+  tree low_val = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+					   GSI_SAME_STMT);
+  t = build_call_expr (builtin_decl_explicit (BUILT_IN_OMP_GET_THREAD_NUM),
+		       0);
+  t = fold_convert (TREE_TYPE (fd->loop.v), t);
+  tree high_val = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+					   GSI_SAME_STMT);
+
+  tree ind_var = create_tmp_reg (type, "__cilk_ind_var");
+  gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_FOR);
+  
+  /* Not needed in SSA form right now.  */
+  gcc_assert (!gimple_in_ssa_p (cfun));
+  if (l2_dom_bb == NULL)
+    l2_dom_bb = l1_bb;
+
+  tree n1 = low_val;
+  tree n2 = high_val;
+  
+  expand_omp_build_assign (&gsi, ind_var, n1);
+
+  /* Remove the GIMPLE_OMP_FOR statement.  */
+  gsi_remove (&gsi, true);
+
+  gimple stmt;
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+      enum tree_code code = PLUS_EXPR;
+      if (POINTER_TYPE_P (type))
+	t = fold_build_pointer_plus (ind_var, build_one_cst (type)); 
+      else
+	t = fold_build2 (code, type, ind_var, build_one_cst (type));
+      expand_omp_build_assign (&gsi, ind_var, t);
+
+      /* Remove GIMPLE_OMP_CONTINUE.  */
+      gsi_remove (&gsi, true);
+    }
+
+  /* Emit the condition in L1_BB.  */
+  gsi = gsi_start_bb (l1_bb);
+
+  tree step = fold_convert (type, fd->loop.step);
+  if ((TREE_CODE (step) == INTEGER_CST && tree_int_cst_sgn (step) < 1)) 
+    step = fold_build1_loc (UNKNOWN_LOCATION, NEGATE_EXPR, type, step);
+  
+  t = build2 (MULT_EXPR, type, ind_var, step);
+  tree tmp = create_tmp_reg (type, NULL);
+  gsi_insert_after (&gsi, gimple_build_assign (tmp, t), GSI_NEW_STMT);
+
+  tree tmp2 = create_tmp_reg (type, NULL);
+  tree cvtd = fold_convert (type, fd->loop.n1);
+  gsi_insert_after (&gsi, gimple_build_assign (tmp2, cvtd), GSI_NEW_STMT);
+  
+  if (fd->loop.cond_code == GE_EXPR || fd->loop.cond_code == GT_EXPR)
+    t = fold_build2 (MINUS_EXPR, type, tmp2, tmp);
+  else
+    t = fold_build2 (PLUS_EXPR, type, tmp2, tmp);
+
+  tmp = create_tmp_reg (type, NULL);
+  gsi_insert_after (&gsi, gimple_build_assign (tmp, t), GSI_NEW_STMT);
+
+  cvtd = fold_convert (TREE_TYPE (fd->loop.v), tmp);
+  gsi_insert_after (&gsi, gimple_build_assign (fd->loop.v, cvtd), 
+		    GSI_NEW_STMT);
+  
+  t = fold_convert (type, n2);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+				false, GSI_CONTINUE_LINKING);
+  /* The condition is always '<' since the runtime will fill in the low
+     and high values.  */
+  t = build2 (LT_EXPR, boolean_type_node, ind_var, t);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p,
+		 NULL, NULL)
+      || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p,
+		    NULL, NULL))
+    {
+      gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+
+  /* Connect the new blocks.  */
+  remove_edge (FALLTHRU_EDGE (entry_bb));
+
+  edge e, ne;
+  if (!broken_loop)
+    {
+      remove_edge (BRANCH_EDGE (entry_bb));
+      make_edge (entry_bb, l1_bb, EDGE_FALLTHRU);
+
+      e = BRANCH_EDGE (l1_bb);
+      ne = FALLTHRU_EDGE (l1_bb);
+      e->flags = EDGE_TRUE_VALUE;
+    }
+  else
+    {
+      single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+
+      ne = single_succ_edge (l1_bb);
+      e = make_edge (l1_bb, l0_bb, EDGE_TRUE_VALUE);
+
+    }
+  ne->flags = EDGE_FALSE_VALUE;
+  e->probability = REG_BR_PROB_BASE * 7 / 8;
+  ne->probability = REG_BR_PROB_BASE / 8;
+
+  set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb);
+  set_immediate_dominator (CDI_DOMINATORS, l2_bb, l2_dom_bb);
+  set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb);
+
+  if (!broken_loop)
+    {
+      struct loop *loop = alloc_loop ();
+      loop->header = l1_bb;
+      loop->latch = cont_bb;
+      add_loop (loop, l1_bb->loop_father);
+      loop->safelen = INT_MAX;
+    }
+
+  /* Pick the correct library function based on the precision of the
+     induction variable type.  */
+  tree lib_fun = NULL_TREE;
+  if (TYPE_PRECISION (type) == 32)
+    lib_fun = cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (type) == 64)
+    lib_fun = cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+  
+  tree count = gimple_cilk_for_loop_count (fd->for_stmt);
+  gcc_assert (count != NULL_TREE);
+  
+  /* ws_args contains three pieces of information: the library function to
+     call (__cilkrts_cilk_for_32 or __cilkrts_cilk_for_64), the loop count,
+     and the grain value.  */
+  vec_alloc (region->ws_args, 3);
+  region->ws_args->quick_push (lib_fun);
+  region->ws_args->quick_push (count);
+  region->ws_args->quick_push (grain);
+}
 
 /* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
    loop.  Given parameters:
@@ -6878,6 +7354,8 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 
   if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD)
     expand_omp_simd (region, &fd);
+  else if (gimple_omp_for_kind (fd.for_stmt) == GF_OMP_FOR_KIND_CILKFOR)
+    expand_cilk_for (region, &fd);
   else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
 	   && !fd.have_ordered)
     {
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
new file mode 100644
index 0000000..3f68022
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
@@ -0,0 +1,100 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+static void check (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start;  ii < end; ii = ii + incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+static void check_reverse (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start; ii >= end; ii = ii - incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+
+int main (void)
+{
+  int Array[10];
+  int x = 9, y = 0, z = 3;
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 1133;
+  check (Array, 0, 10, 1, 1133);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 9; ii > -1; ii--)
+    Array[ii] = 4433;
+  check_reverse (Array, 9, 0, 1, 4433);
+
+  _Cilk_for (int ii = 9; ii > -1; --ii)
+    Array[ii] = 9988;
+  check_reverse (Array, 9, 0, 1, 9988);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    Array[ii] = 1328;
+  check (Array, 0, 10, 2, 1328);
+
+  _Cilk_for (int ii = 9; ii >= 0; ii -= 2)
+    Array[ii] = 1738;
+  check_reverse (Array, 9, 0, 2, 1738);
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if (ii % 2)
+	Array[ii] = 1343;
+      else
+	Array[ii] = 3413;
+    }
+
+  check (Array, 1, 10, 2, 1343); 
+  check (Array, 0, 10, 2, 3413); 
+
+  _Cilk_for (short cc = 0; cc < 10; cc++) 
+    Array[cc] = 1343;
+  check (Array, 0, 10, 1, 1343);
+
+  _Cilk_for (short cc = 9; cc >= 0; cc--)
+    Array[cc] = 1348;
+  check_reverse (Array, 9, 0, 1, 1348);
+
+
+
+  /* Loop with polynomials in the start, final and incr.  */
+  _Cilk_for (int ii = z - 3; ii <= z * 3; ii += z-1)
+    { 
+      Array[ii] = 3233;
+    }
+
+  for (int ii = z-3; ii <= z*3; ii += (z-1))
+    if (Array[ii] != 3233)
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..0ebc09a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 5;
+
+  _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
+    q = 5;
+
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } */
+    q = 5;
+
+  _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..6cb9b03
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,35 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+
+  for (int ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (int ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..ff8bc0a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK, it will do a type conversion to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize = 1 
+  while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */
+    {
+    /* Blah */
+    }
+
+#pragma cilk grainsize = 1 
+  int q = 0; /* { dg-warning "grainsize pragma is not followed" } */
+  _Cilk_for (q = 0; q < 10; q++)
+    Array1[q]  = 5;
+
+  while (Array1[5] != 0)
+    {
+    /* Blah */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..7a779f7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,41 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int ii = 0;
+
+#if 1
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+#endif
+  _Cilk_for(int *aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      __builtin_abort ();
+#endif
+
+  _Cilk_for (int *aa = a; aa < a + ARRAY_SIZE; aa = aa + 2)
+    *aa = 4;
+
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii = ii + 2) 
+    if (a[ii] != 4) 
+      __builtin_abort ();
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
new file mode 100644
index 0000000..cffe17e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
@@ -0,0 +1,79 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+int main (void)
+{
+  int Array[10][10];
+
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj++)
+	{
+	  Array[ii][jj] = 0;
+	}
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 5; jj++)
+      Array[ii][jj] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 5; jj++)
+      if (Array[ii][jj] != 5)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+
+  /* One goes up and one goes down.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 9; jj >= 0; jj--)
+      Array[ii][jj] = 7;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 9; jj >= 0; jj--)
+      if (Array[ii][jj] != 7)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 10; jj += 2)
+      Array[ii][jj] = 9;
+  
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj += 2)
+      if (Array[ii][jj] != 9)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    _Cilk_for (int jj = 5; jj < 9; jj++)
+      Array[ii][jj] = 11; 
+  
+  for (int ii = 0; ii < 10; ii += 2)
+    for (int jj = 5; jj < 9; jj++)
+      if (Array[ii][jj] != 11)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 0822d35..9412bab 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -351,6 +351,7 @@ enum omp_clause_schedule_kind {
   OMP_CLAUSE_SCHEDULE_GUIDED,
   OMP_CLAUSE_SCHEDULE_AUTO,
   OMP_CLAUSE_SCHEDULE_RUNTIME,
+  OMP_CLAUSE_SCHEDULE_CILKFOR,
   OMP_CLAUSE_SCHEDULE_LAST
 };
 
diff --git a/gcc/tree.def b/gcc/tree.def
index 364e510..0a32bc4 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1051,6 +1051,10 @@ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
    Operands like for OMP_FOR.  */
 DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6)
 
+/* Cilk Plus - _Cilk_for (..)
+   Operands like for OMP_FOR.  */
+DEFTREECODE (CILK_FOR, "cilk_for", tcc_statement, 6)
+
 /* OpenMP - #pragma omp distribute [clause1 ... clauseN]
    Operands like for OMP_FOR.  */
 DEFTREECODE (OMP_DISTRIBUTE, "omp_distribute", tcc_statement, 6)

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-12-16  0:40                     ` Iyer, Balaji V
@ 2013-12-16 21:21                       ` Jason Merrill
  2013-12-16 23:41                         ` Iyer, Balaji V
                                           ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Jason Merrill @ 2013-12-16 21:21 UTC (permalink / raw)
  To: Iyer, Balaji V, 'Jeff Law', 'Aldy Hernandez'
  Cc: 'gcc-patches@gcc.gnu.org', 'rth@redhat.com',
	'Jakub Jelinek'

On 12/15/2013 07:40 PM, Iyer, Balaji V wrote:
> -		       tree clauses, tree *cclauses)
> +		       tree clauses_or_grain, tree *cclauses)

Instead of this, please make the grainsize a new type of clause.

> -  return (gimple_omp_subcode (g) & GF_OMP_FOR_COMBINED) != 0;
> +  return (gimple_omp_for_kind (g) == GF_OMP_FOR_COMBINED);

I don't really know this code, but this change seems unlikely to be 
correct.  Can you explain it?

> +  tree data_name = get_identifier (".omp_data_i");
> +  if (is_cilk_for)
> +    data_name = get_identifier (".cilk_for_data_i");

Why does the name of an artificial parameter matter?

>  }
> +/* A subroutine of expand_omp_for.  Generate code for _Cilk_for loop.
> +   Given parameters:

Need a blank line after the }.

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2013-12-16 21:21                       ` Jason Merrill
@ 2013-12-16 23:41                         ` Iyer, Balaji V
  2013-12-18  0:22                         ` Iyer, Balaji V
  2014-01-06 22:29                         ` Iyer, Balaji V
  2 siblings, 0 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2013-12-16 23:41 UTC (permalink / raw)
  To: Jason Merrill, 'Jeff Law', 'Aldy Hernandez'
  Cc: 'gcc-patches@gcc.gnu.org', 'rth@redhat.com',
	'Jakub Jelinek'

> 
> > -  return (gimple_omp_subcode (g) & GF_OMP_FOR_COMBINED) != 0;
> > +  return (gimple_omp_for_kind (g) == GF_OMP_FOR_COMBINED);
> 
> I don't really know this code, but this change seems unlikely to be correct.
> Can you explain it?

I really need help with this. I need a new enum value (I call it GF_OMP_FOR_KIND_CILKFOR) to mark _Cilk_for so that I can distinguish it from the other loop kinds. Can someone please help me pick a value, or explain to me how the enum gf_mask is structured? I do not fully understand the header comment.

Thanks,

Balaji V. Iyer.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2013-12-16 21:21                       ` Jason Merrill
  2013-12-16 23:41                         ` Iyer, Balaji V
@ 2013-12-18  0:22                         ` Iyer, Balaji V
  2014-01-07 20:40                           ` Jason Merrill
  2014-01-06 22:29                         ` Iyer, Balaji V
  2 siblings, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2013-12-18  0:22 UTC (permalink / raw)
  To: Jason Merrill, 'Jeff Law', 'Aldy Hernandez'
  Cc: 'gcc-patches@gcc.gnu.org', 'rth@redhat.com',
	'Jakub Jelinek'

[-- Attachment #1: Type: text/plain, Size: 1616 bytes --]

Hi Jason,
	Here is a fixed patch. I have also answered your questions below.

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Jason Merrill [mailto:jason@redhat.com]
> Sent: Monday, December 16, 2013 4:22 PM
> To: Iyer, Balaji V; 'Jeff Law'; 'Aldy Hernandez'
> Cc: 'gcc-patches@gcc.gnu.org'; 'rth@redhat.com'; 'Jakub Jelinek'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On 12/15/2013 07:40 PM, Iyer, Balaji V wrote:
> > -		       tree clauses, tree *cclauses)
> > +		       tree clauses_or_grain, tree *cclauses)
> 
> Instead of this, please make the grainsize a new type of clause.
> 

I store it in OMP_FOR_CLAUSES because OMP clauses cannot occur in a _Cilk_for, so adding a new clause seems like overkill IMHO. I need a place to store the grain value, and so I chose this spot.
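
To make the role of the grain value concrete, here is a minimal serial sketch of a looper with the same argument shape that cilk_declare_looper builds for __cilkrts_cilk_for_32 in this patch: a body callback, a data pointer, a count, and an int grain. Everything named sketch_* is hypothetical. The half-open [low, high) convention, the plain serial loop, and the fallback grain of 1 are assumptions made for illustration; the real runtime runs chunks in parallel and picks its own grain when it is given zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: process iterations in [low, high).  */
typedef void (*sketch_body_fn) (void *data, uint32_t low, uint32_t high);

/* Hypothetical payload handed to the body through the data pointer.  */
struct sketch_data
{
  int *array;
  int value;
  unsigned chunks;	/* Number of times the body was invoked.  */
};

/* Example body: fill the assigned range and count the call.  */
static void
sketch_body (void *data, uint32_t low, uint32_t high)
{
  struct sketch_data *d = (struct sketch_data *) data;
  for (uint32_t i = low; i < high; i++)
    d->array[i] = d->value;
  d->chunks++;
}

/* Serial stand-in for the runtime entry point.  Splits COUNT iterations
   into chunks of at most GRAIN iterations and calls BODY once per chunk.
   Assumption: a GRAIN of zero (or less) falls back to 1 here.  */
static void
sketch_cilk_for_32 (sketch_body_fn body, void *data, uint32_t count,
		    int grain)
{
  assert (body != NULL);
  uint32_t g = grain > 0 ? (uint32_t) grain : 1;
  uint32_t low = 0;
  while (low < count)
    {
      /* Compare against the remaining iterations to avoid overflow.  */
      uint32_t high = (count - low > g) ? low + g : count;
      body (data, low, high);
      low = high;
    }
}
```

With a count of 10 and a grain of 4, this sketch calls the body three times, on [0, 4), [4, 8) and [8, 10). A larger grain means fewer, coarser chunks and less scheduling overhead, at the cost of less available parallelism.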

> > -  return (gimple_omp_subcode (g) & GF_OMP_FOR_COMBINED) != 0;
> > +  return (gimple_omp_for_kind (g) == GF_OMP_FOR_COMBINED);
> 
> I don't really know this code, but this change seems unlikely to be correct.
> Can you explain it?
> 

Yep, it is wrong. I have reverted it. I have moved the gf_mask bits around to make room for the new kind.
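
A minimal sketch of the layout after this change, using hypothetical SK_* names that mirror the GF_OMP_FOR_* values in the attached gimple.h hunk: the low three bits hold the loop kind, and the combined flags are independent bits above them. It assumes that the kind accessor masks the subcode with the kind mask, which is why the flag has to be tested with a bitwise AND rather than compared against the kind.

```c
#include <assert.h>

/* Hypothetical stand-ins mirroring the GF_OMP_FOR_* values in the patch.  */
enum sketch_mask
{
  SK_KIND_MASK       = 7 << 0,	/* Low three bits: the loop kind.  */
  SK_KIND_FOR        = 0 << 0,
  SK_KIND_DISTRIBUTE = 1 << 0,
  SK_KIND_SIMD       = 2 << 0,
  SK_KIND_CILKSIMD   = 3 << 0,
  SK_KIND_CILKFOR    = 4 << 0,	/* Does not fit under a 3 << 0 mask.  */
  SK_COMBINED        = 1 << 3,	/* Independent flag bits above the kind.  */
  SK_COMBINED_INTO   = 1 << 4
};

/* Build a subcode from a kind and a set of flag bits, checking that the
   two fields do not overlap.  */
static unsigned
sketch_make_subcode (unsigned kind, unsigned flags)
{
  assert ((kind & ~(unsigned) SK_KIND_MASK) == 0);
  assert ((flags & (unsigned) SK_KIND_MASK) == 0);
  return kind | flags;
}

/* Extract the loop kind: only the masked bits count.  */
static unsigned
sketch_kind (unsigned subcode)
{
  return subcode & SK_KIND_MASK;
}

/* Test the combined flag: a bitwise test, independent of the kind.  */
static int
sketch_combined_p (unsigned subcode)
{
  return (subcode & SK_COMBINED) != 0;
}
```

Because the flag bit lies outside the kind mask, sketch_kind never returns SK_COMBINED, so an equality test of the kind against the flag is always false regardless of whether the flag is set.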

> > +  tree data_name = get_identifier (".omp_data_i");
> > +  if (is_cilk_for)
> > +    data_name = get_identifier (".cilk_for_data_i");
> 
> Why does the name of an artificial parameter matter?
> 

Well, it helps differentiate between the two. I have reverted it.

> >  }
> > +/* A subroutine of expand_omp_for.  Generate code for _Cilk_for loop.
> > +   Given parameters:
> 
> Need a blank line after the }.
> 

Fixed.

> Jason


[-- Attachment #2: diff2.txt --]
[-- Type: text/plain, Size: 55264 bytes --]

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
old mode 100644
new mode 100755
index cfaeaf0..c3dcb21
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -414,6 +414,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
old mode 100644
new mode 100755
index 4357d1f..508de30
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
old mode 100644
new mode 100755
index 3ccf8f9..6ae0a0f
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -386,7 +386,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
   bool fail = false;
   int i;
 
-  if (code == CILK_SIMD
+  if ((code == CILK_SIMD || code == CILK_FOR) 
       && !c_check_cilk_loop (locus, TREE_VEC_ELT (declv, 0)))
     fail = true;
 
@@ -516,7 +516,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 					    0))
 		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
 					 ? LT_EXPR : GE_EXPR);
-		  else if (code != CILK_SIMD)
+		  else if (code != CILK_SIMD && code != CILK_FOR)
 		    cond_ok = false;
 		}
 	    }
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
old mode 100644
new mode 100755
index 64a5b66..9d6efc5
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1394,6 +1394,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
old mode 100644
new mode 100755
index 5379b9e..ef62653
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
old mode 100644
new mode 100755
index c78d269..3d2a6e0
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1242,9 +1242,10 @@ static bool c_parser_objc_diagnose_bad_element_prefix
   (c_parser *, struct c_declspecs *);
 
 /* Cilk Plus supporting routines.  */
-static void c_parser_cilk_simd (c_parser *);
+static void c_parser_cilk_simd (c_parser *, bool, tree);
 static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context);
 static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
+static void c_parser_cilk_grainsize (c_parser *);
 
 /* Parse a translation unit (C90 6.7, C99 6.9).
 
@@ -4776,6 +4777,16 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (c_parser_peek_token (parser)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      c_parser_skip_to_end_of_block_or_statement (parser);
+	    }
+	  else
+	    c_parser_cilk_simd (parser, true, integer_zero_node);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9386,7 +9397,24 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       if (!c_parser_cilk_verify_simd (parser, context))
 	return false;
       c_parser_consume_pragma (parser);
-      c_parser_cilk_simd (parser);
+      c_parser_cilk_simd (parser, false, NULL_TREE);
+      return false;
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not"
+		   " enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
       return false;
 
     default:
@@ -11460,7 +11488,7 @@ c_parser_omp_flush (c_parser *parser)
 
 static tree
 c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
-		       tree clauses, tree *cclauses)
+		       tree clauses_or_grain, tree *cclauses)
 {
   tree decl, cond, incr, save_break, save_cont, body, init, stmt, cl;
   tree declv, condv, incrv, initv, ret = NULL;
@@ -11468,6 +11496,8 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   int i, collapse = 1, nbraces = 0;
   location_t for_loc;
   vec<tree, va_gc> *for_block = make_tree_vector ();
+  tree clauses = code == CILK_FOR ? NULL_TREE : clauses_or_grain;
+  tree grain = code == CILK_FOR ? clauses_or_grain : NULL_TREE;
 
   for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl))
     if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE)
@@ -11480,11 +11510,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   condv = make_tree_vec (collapse);
   incrv = make_tree_vec (collapse);
 
-  if (!c_parser_next_token_is_keyword (parser, RID_FOR))
+  if (code != CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_FOR))
     {
       c_parser_error (parser, "for statement expected");
       return NULL;
     }
+  if (code == CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_CILK_FOR))
+    {
+      c_parser_error (parser, "_Cilk_for statement expected");
+      return NULL;
+    }
   for_loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
 
@@ -11562,7 +11599,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	    case LE_EXPR:
 	      break;
 	    case NE_EXPR:
-	      if (code == CILK_SIMD)
+	      if (code == CILK_SIMD || code == CILK_FOR)
 		break;
 	      /* FALLTHRU.  */
 	    default:
@@ -11736,6 +11773,11 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 		  }
 	    }
 	  OMP_FOR_CLAUSES (stmt) = clauses;
+	  /* If it is a _Cilk_for statement, then the OMP_FOR_CLAUSES location
+	     stores the user-defined grain value or an integer_zero_node 
+	     indicating that the runtime must compute a suitable grain.  */
+	  if (code == CILK_FOR)
+	    OMP_FOR_CLAUSES (stmt) = grain;
 	}
       ret = stmt;
     }
@@ -13566,18 +13608,75 @@ c_parser_cilk_all_clauses (c_parser *parser)
   return c_finish_cilk_clauses (clauses);
 }
 
+/* This function helps parse the grainsize pragma for a _Cilk_for statement. 
+   Here is the correct syntax of this pragma: 
+	    #pragma cilk grainsize = <EXP> 
+ */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+
+  /* consume the 'grainsize' keyword.  */
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>") != 0)
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR)
+	{
+	  error_at (input_location, "cannot convert grain to long integer");
+	  c_parser_skip_to_pragma_eol (parser);
+	}   
+      else if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token && token->type == CPP_KEYWORD
+	      && token->keyword == RID_CILK_FOR)
+	    {
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain && grain != error_mark_node) 
+		c_parser_cilk_simd (parser, true, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
    loops.  */
 
 static void
-c_parser_cilk_simd (c_parser *parser)
+c_parser_cilk_simd (c_parser *parser, bool is_cilk_for, tree grain)
 {
-  tree clauses = c_parser_cilk_all_clauses (parser);
+  tree super_block = NULL_TREE;
+  tree clauses = NULL_TREE;
+  
+  if (!is_cilk_for)
+    clauses = c_parser_cilk_all_clauses (parser);
+  else
+    { 
+      super_block = c_begin_omp_parallel ();
+      clauses = grain;
+    }
   tree block = c_begin_compound_stmt (true);
   location_t loc = c_parser_peek_token (parser)->location;
-  c_parser_omp_for_loop (loc, parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  c_parser_omp_for_loop (loc, parser, code, clauses, NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
+  if (is_cilk_for) 
+    /* The term super_block is not used in scheduling terms but in 
+       set-theory, i.e. set vs. super-set.  */ 
+    c_finish_omp_parallel (loc, NULL_TREE, super_block);
 }
 \f
 /* Parse a transaction attribute (GCC Extension).
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
old mode 100644
new mode 100755
index 8634194..7f8f97a
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
old mode 100644
new mode 100755
index 52b3785..2574f12
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -106,6 +106,26 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME.  */
+
+static tree
+cilk_declare_looper (const char *name, tree type, enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -269,6 +289,15 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_32",
+                                            unsigned_intSI_type_node,
+                                            BUILT_IN_CILK_FOR_32);
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_64",
+                                            unsigned_intDI_type_node,
+                                            BUILT_IN_CILK_FOR_64);
+
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
old mode 100644
new mode 100755
index e990992..ea9f6ff
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,9 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
+
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -77,6 +80,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 42e3f5f..867aa52 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1158,6 +1158,8 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_DISTRIBUTE:
 	  pp_string (buffer, "#pragma omp distribute");
 	  break;
+	case GF_OMP_FOR_KIND_CILKFOR:
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -1167,7 +1169,10 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	  if (i)
 	    spc += 2;
 	  newline_and_indent (buffer, spc);
-	  pp_string (buffer, "for (");
+	  if (gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
+	    pp_string (buffer, "_Cilk_for (");
+	  else
+	    pp_string (buffer, "for (");
 	  dump_generic_node (buffer, gimple_omp_for_index (gs, i), spc,
 			     flags, false);
 	  pp_string (buffer, " = ");
diff --git a/gcc/gimple.h b/gcc/gimple.h
old mode 100644
new mode 100755
index a49016f..1c0e049
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -91,13 +91,14 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
-    GF_OMP_FOR_KIND_MASK	= 3 << 0,
+    GF_OMP_FOR_KIND_MASK	= 7 << 0,
     GF_OMP_FOR_KIND_FOR		= 0 << 0,
     GF_OMP_FOR_KIND_DISTRIBUTE	= 1 << 0,
     GF_OMP_FOR_KIND_SIMD	= 2 << 0,
     GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
-    GF_OMP_FOR_COMBINED		= 1 << 2,
-    GF_OMP_FOR_COMBINED_INTO	= 1 << 3,
+    GF_OMP_FOR_KIND_CILKFOR     = 4 << 0,
+    GF_OMP_FOR_COMBINED	= 1 << 3,
+    GF_OMP_FOR_COMBINED_INTO	= 1 << 4,
     GF_OMP_TARGET_KIND_MASK	= 3 << 0,
     GF_OMP_TARGET_KIND_REGION	= 0 << 0,
     GF_OMP_TARGET_KIND_DATA	= 1 << 0,
@@ -523,6 +524,12 @@ struct GTY(()) gimple_omp_for_iter {
 
   /* Increment.  */
   tree incr;
+
+  /* Loop count, only used by _Cilk_for.  */
+  tree loop_count;
+
+  /* Grain value, only used by _Cilk_for.  */
+  tree grain;
 };
 
 /* GIMPLE_OMP_FOR */
@@ -4562,6 +4569,58 @@ gimple_omp_for_set_pre_body (gimple gs, gimple_seq pre_body)
   omp_for_stmt->pre_body = pre_body;
 }
 
+/* Set COUNT to be the loop count value for OMP_FOR GS.  */
+
+static inline void
+gimple_cilk_for_set_count (tree count, gimple gs)
+{
+  gimple_statement_omp_for *omp_for_stmt =
+    as_a <gimple_statement_omp_for> (gs);
+  omp_for_stmt->iter[0].loop_count = count;
+}
+
+/* Set GRAIN to be the grain value used by Cilk runtime for OMP_FOR GS.  */
+
+static inline void
+gimple_cilk_for_set_grain (tree grain, gimple gs)
+{
+  gimple_statement_omp_for *omp_for_stmt =
+    as_a <gimple_statement_omp_for> (gs);
+  omp_for_stmt->iter[0].grain = grain;
+}
+
+/* Returns the induction variable of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_induction_var (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->index;
+}
+
+/* Returns the LOOP_COUNT value of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_loop_count (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->loop_count;
+}
+
+/* Returns the GRAIN value of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_grain (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->grain;
+}
 
 /* Return the clauses associated with OMP_PARALLEL GS.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 1ca847a..bcc5ede5 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6545,6 +6545,26 @@ find_combined_omp_for (tree *tp, int *walk_subtrees, void *)
   return NULL_TREE;
 }
 
+/* Compute the loop count, the absolute value of (FINAL - INIT) / INCR, and
+   store it in CFOR->ITER->LOOP_COUNT.  GRAIN is stored in CFOR->ITER->GRAIN.  */
+
+static void
+cilk_for_compute_set_count_grain (gimple cfor, tree init, tree final, 
+				  tree incr, tree grain)
+{
+  enum tree_code cond = gimple_omp_for_cond (cfor, 0);
+  tree type = TREE_TYPE (init);
+  tree m = fold_build2 (MINUS_EXPR, type, final, init);
+  if (cond == GT_EXPR || cond == GE_EXPR)
+    m = fold_build1 (NEGATE_EXPR, TREE_TYPE (m), m);
+
+  tree t = fold_build2 (TRUNC_DIV_EXPR, type, m, incr);
+  if (cond == LE_EXPR || cond == GE_EXPR)
+    t = fold_build2 (PLUS_EXPR, type, t, build_one_cst (type));
+  gimple_cilk_for_set_count (t, cfor);
+  gimple_cilk_for_set_grain (grain, cfor);
+}
+
 /* Gimplify the gross structure of an OMP_FOR statement.  */
 
 static enum gimplify_status
@@ -6559,7 +6579,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
   bool simd;
   bitmap has_decl_expr = NULL;
 
+  tree grain = NULL_TREE;
+  tree orig_init = NULL_TREE, orig_cond = NULL_TREE, orig_incr = NULL_TREE;
   orig_for_stmt = for_stmt = *expr_p;
+  
+  if (TREE_CODE (for_stmt) == CILK_FOR) 
+    { 
+      grain = OMP_FOR_CLAUSES (for_stmt);
+      OMP_FOR_CLAUSES (for_stmt) = NULL_TREE;
+    }
 
   simd = TREE_CODE (for_stmt) == OMP_SIMD
     || TREE_CODE (for_stmt) == CILK_SIMD;
@@ -6677,7 +6705,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	}
       else
 	var = decl;
-
+  
+      /* Original initial, final and increment values are necessary to compute
+	 the loop-count.  Otherwise, they are stored in variables and their
+	 context could be changed, potentially making it impossible to compute
+	 them correctly.  */
+      orig_init = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6689,6 +6722,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gcc_assert (COMPARISON_CLASS_P (t));
       gcc_assert (TREE_OPERAND (t, 0) == decl);
 
+      orig_cond = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6713,6 +6747,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	    t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	    t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	    TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	    orig_incr = build_one_cst (TREE_TYPE (t));
 	    break;
 	  }
 
@@ -6726,6 +6761,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  orig_incr = build_one_cst (TREE_TYPE (t));
 	  break;
 
 	case MODIFY_EXPR:
@@ -6753,8 +6789,16 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	      gcc_unreachable ();
 	    }
 
-	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-				is_gimple_val, fb_rvalue);
+	  orig_incr = TREE_OPERAND (t, 1);
+	  /* Right here we are just trying to extract the absolute
+	     value of the increment.  */
+	  if (TREE_CODE (t) == MINUS_EXPR
+	      || TREE_CODE  (TREE_OPERAND (t, 1)) == NEGATE_EXPR
+	      || (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST
+		  && tree_int_cst_sgn (TREE_OPERAND (t, 1)) < 1))
+	    orig_incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), orig_incr);
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), pre_p,
+				NULL, is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
 	  if (c)
 	    {
@@ -6825,6 +6869,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
     case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
     case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break;
+    case CILK_FOR: kind = GF_OMP_FOR_KIND_CILKFOR; break;
     case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break;
     default:
       gcc_unreachable ();
@@ -6859,6 +6904,10 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gimple_omp_for_set_incr (gfor, i, TREE_OPERAND (t, 1));
     }
 
+  if (kind == GF_OMP_FOR_KIND_CILKFOR)
+    cilk_for_compute_set_count_grain (gfor, orig_init, orig_cond, orig_incr, 
+				      grain);
+
   gimplify_seq_add_stmt (pre_p, gfor);
   if (ret != GS_ALL_DONE)
     return GS_ERROR;
@@ -7880,6 +7929,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OMP_FOR:
 	case OMP_SIMD:
 	case CILK_SIMD:
+	case CILK_FOR:
 	case OMP_DISTRIBUTE:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
old mode 100644
new mode 100755
index 97092dd..be75fde
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "tree-nested.h"
 #include "tree-eh.h"
+#include "cilk.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -198,6 +199,13 @@ struct omp_for_data
   struct omp_for_data_loop *loops;
 };
 
+/* A structure holding the necessary elements of a _Cilk_for statement.  It
+   is passed to the statement walker through WALK_STMT_INFO->INFO.  */
+typedef struct cilk_for_information {
+  bool found;
+  tree induction_var;
+  tree count;
+} cilk_for_info;
 
 static splay_tree all_contexts;
 static int taskreg_nesting_level;
@@ -314,6 +322,8 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   fd->have_ordered = false;
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
   fd->chunk_size = NULL_TREE;
+  if (gimple_omp_for_kind (fd->for_stmt) ==  GF_OMP_FOR_KIND_CILKFOR)
+    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
   collapse_iter = NULL;
   collapse_count = NULL;
 
@@ -1820,29 +1830,125 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	scan_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 }
 
-/* Create a new name for omp child function.  Returns an identifier.  */
+/* Create a new name for omp child function.  Returns an identifier.  If 
+   IS_CILK_FOR is true then the suffix for the child function is 
+   "_cilk_for_fn."  */
 
 static GTY(()) unsigned int tmp_ompfn_id_num;
 
 static tree
-create_omp_child_function_name (bool task_copy)
+create_omp_child_function_name (bool task_copy, bool is_cilk_for)
 {
+  if (is_cilk_for)
+    return clone_function_name (current_function_decl, "_cilk_for_fn");
   return (clone_function_name (current_function_decl,
 			       task_copy ? "_omp_cpyfn" : "_omp_fn"));
 }
 
+/* Helper function for walk_gimple_seq.  *GSI_P is the gimple statement
+   iterator passed by walk_gimple_seq and *WI->INFO holds the CILK_FOR_INFO
+   structure.  This function sets the values inside this structure if it
+   finds a _Cilk_for statement in *GSI_P.  HANDLED_OPS_P is unused.  */
+
+static tree
+find_cilk_for_stmt (gimple_stmt_iterator *gsi_p,
+		    bool *handled_ops_p ATTRIBUTE_UNUSED,
+		    struct walk_stmt_info *wi)
+{
+  cilk_for_info *cf_info = (cilk_for_info *) wi->info;
+  gimple stmt = gsi_stmt (*gsi_p);
+
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+      && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_CILKFOR)
+      /* For nested _Cilk_for statements, just look into the
+	 outer-most one.  */
+      && cf_info->found == false)
+    {
+      cf_info->found = true;
+      cf_info->induction_var = gimple_cilk_for_induction_var (stmt);
+      cf_info->count = gimple_cilk_for_loop_count (stmt);
+    }
+  return NULL_TREE;
+}
+
+/* Returns true if STMT contains a CILK_FOR statement.  If found then
+   populate *IND_VAR and *LOOP_COUNT with induction variable
+   and loop-count value.  Otherwise these values remain untouched.  
+   IND_VAR and LOOP_COUNT can be NULL and if so then they are also 
+   left untouched.  */
+
+static bool
+is_cilk_for_stmt (gimple stmt, tree *ind_var, tree *loop_count)
+{
+  if (!flag_enable_cilkplus)
+    return false;
+  if (gimple_code (stmt) == GIMPLE_OMP_PARALLEL)
+    stmt = gimple_omp_body (stmt);
+  if (gimple_code (stmt) == GIMPLE_BIND)
+    {
+      gimple_seq body = gimple_bind_body (stmt);
+      struct walk_stmt_info wi;
+      cilk_for_info cf_info;
+      memset (&cf_info, 0, sizeof (cilk_for_info));
+      memset (&wi, 0, sizeof (wi));
+      wi.info = &cf_info;
+      walk_gimple_seq (body, find_cilk_for_stmt, NULL, &wi);
+      if (cf_info.found)
+	{
+	  if (ind_var)
+	    *ind_var = cf_info.induction_var;
+	  if (loop_count)
+	    *loop_count = cf_info.count;
+	  return true;
+	}
+    }
+  return false;
+}
+
+/* Returns the type of the induction variable for the child function for
+   _Cilk_for and the types for _high and _low variables based on TYPE.  */
+
+static tree
+cilk_for_check_loop_diff_type (tree type)
+{
+  if (type == integer_type_node)
+    return type;
+  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
+    { 
+      if (TYPE_UNSIGNED (type)) 
+	return uint32_type_node;
+      else
+	return integer_type_node;
+    }
+  else
+    {
+      if (TYPE_UNSIGNED (type)) 
+	return uint64_type_node;
+      else
+	return long_long_integer_type_node;
+    }
+  gcc_unreachable ();
+}
+
 /* Build a decl for the omp child function.  It'll not contain a body
    yet, just the bare decl.  */
 
 static void
 create_omp_child_function (omp_context *ctx, bool task_copy)
 {
-  tree decl, type, name, t;
+  tree decl, type, name, t, ind_var = NULL_TREE, loop_count = NULL_TREE;
 
-  name = create_omp_child_function_name (task_copy);
+  bool is_cilk_for = is_cilk_for_stmt (ctx->stmt, &ind_var, &loop_count);
+  tree cilk_var_type = (is_cilk_for ?
+    cilk_for_check_loop_diff_type (TREE_TYPE (ind_var)) : NULL_TREE);
+  
+  name = create_omp_child_function_name (task_copy, is_cilk_for);
   if (task_copy)
     type = build_function_type_list (void_type_node, ptr_type_node,
 				     ptr_type_node, NULL_TREE);
+  else if (is_cilk_for)
+    type = build_function_type_list (void_type_node, ptr_type_node,
+				     cilk_var_type, cilk_var_type, NULL_TREE);
   else
     type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
 
@@ -1892,13 +1998,44 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_CONTEXT (t) = decl;
   DECL_RESULT (decl) = t;
 
-  t = build_decl (DECL_SOURCE_LOCATION (decl),
-		  PARM_DECL, get_identifier (".omp_data_i"), ptr_type_node);
+  /* _Cilk_for's child function requires two extra parameters called 
+     __low and __high that are set by the Cilk runtime when it calls this
+     function.  */
+  if (is_cilk_for)
+    {
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__high"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__low"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+    }
+
+  tree data_name = get_identifier (".omp_data_i");
+  t = build_decl (DECL_SOURCE_LOCATION (decl), PARM_DECL, data_name,
+		  ptr_type_node);
   DECL_ARTIFICIAL (t) = 1;
   DECL_NAMELESS (t) = 1;
   DECL_ARG_TYPE (t) = ptr_type_node;
   DECL_CONTEXT (t) = current_function_decl;
   TREE_USED (t) = 1;
+  if (is_cilk_for)
+    DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
   DECL_ARGUMENTS (decl) = t;
   if (!task_copy)
     ctx->receiver_decl = t;
@@ -4317,6 +4454,41 @@ expand_parallel_call (struct omp_region *region, basic_block bb,
 			    false, GSI_CONTINUE_LINKING);
 }
 
+/* Builds a function call using the values from WS_ARGS and data arguments 
+   of ENTRY_STMT.  The function call is inserted into BB.  */
+
+static void
+expand_cilk_for_call (basic_block bb, gimple entry_stmt,
+		      vec <tree, va_gc> *ws_args)
+{
+  tree t, t1, t2;
+  gimple_stmt_iterator gsi;
+  vec <tree, va_gc> *args;
+
+  /* The builtin function's name, the loop-count and the grain value are
+     stored in WS_ARGS.  */
+  tree func_name = (*ws_args)[0];
+  tree count = (*ws_args)[1];
+  tree grain = (*ws_args)[2];
+
+  gsi = gsi_last_bb (bb);
+  t = gimple_omp_parallel_data_arg (entry_stmt);
+  if (t == NULL)
+    t1 = null_pointer_node;
+  else
+    t1 = build_fold_addr_expr (t);
+  t2 = build_fold_addr_expr (gimple_omp_parallel_child_fn (entry_stmt));
+  
+  vec_alloc (args, 4);
+  args->quick_push (t2);
+  args->quick_push (t1);
+  args->quick_push (count);
+  args->quick_push (grain);
+  t = build_call_expr_loc_vec (UNKNOWN_LOCATION, func_name, args);
+
+  force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false, 
+			    GSI_CONTINUE_LINKING);
+}
 
 /* Build the function call to GOMP_task to actually
    generate the task operation.  BB is the block where to insert the code.  */
@@ -4652,8 +4824,39 @@ expand_omp_taskreg (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  /* The way _Cilk_for is constructed in this compiler can be thought of
+     as a parallel omp_for.  But the inner workings between them are very
+     different so we need a way to differentiate between them.  Thus, we
+     added a new schedule type called OMP_CLAUSE_SCHEDULE_CILKFOR, which 
+     pretty much says that this is not a parallel omp for but a _Cilk_for
+     statement.  */
+  bool is_cilk_for =
+    (flag_enable_cilkplus && region->inner &&
+     (region->inner->sched_kind == OMP_CLAUSE_SCHEDULE_CILKFOR));
+
+  /* Extract the __high and __low parameter from the function.  */
+  tree high_arg = NULL_TREE, low_arg = NULL_TREE;
+  if (is_cilk_for)
+    {
+      for (tree ii_arg = DECL_ARGUMENTS (child_fn); ii_arg != NULL_TREE;
+	   ii_arg = TREE_CHAIN (ii_arg))
+	{
+	  if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__high"))
+	    high_arg = ii_arg;
+	  if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__low"))
+	    low_arg = ii_arg;
+	}
+      gcc_assert (high_arg);
+      gcc_assert (low_arg);
+    }
+  
   if (is_combined_parallel (region))
     ws_args = region->ws_args;
+  else if (is_cilk_for) 
+    /* If it is a _Cilk_for statement, it is modelled *like* a parallel for,
+       so the inner statement should have the information that is required
+       by expand_cilk_for_call.  */
+    ws_args = region->inner->ws_args;
   else
     ws_args = NULL;
 
@@ -4759,6 +4962,49 @@ expand_omp_taskreg (struct omp_region *region)
 	    }
 	}
 
+      /* In here the calls to the GET_NUM_THREADS and GET_THREAD_NUM are
+	 removed.  Further, they will be replaced by __low and __high
+	 parameter values.  */
+      gimple high_assign = NULL, low_assign = NULL;
+      if (is_cilk_for)
+	{
+	  gimple_stmt_iterator gsi2 = gsi_start_bb (single_succ (entry_bb));
+	  while (!gsi_end_p (gsi2))
+	    {
+	      gimple stmt = gsi_stmt (gsi2);
+	
+	      if (gimple_call_builtin_p (stmt, BUILT_IN_OMP_GET_NUM_THREADS))
+		{
+		  /* There can only be one call to each of these two
+		     functions.  If there are multiple, then something went
+		     wrong somewhere.  */
+		  gcc_assert (low_assign == NULL);
+		  tree ltype = TREE_TYPE (gimple_get_lhs (stmt));
+		  tree tmp2 = create_tmp_reg (TREE_TYPE (low_arg), NULL);
+		  low_assign = gimple_build_assign 
+		    (gimple_get_lhs (stmt), fold_convert (ltype, tmp2));
+		  gsi_remove (&gsi2, true);
+		  gimple tmp_stmt = gimple_build_assign (tmp2, low_arg);
+		  gsi_insert_before (&gsi2, low_assign, GSI_NEW_STMT);
+		  gsi_insert_before (&gsi2, tmp_stmt, GSI_NEW_STMT);
+		}
+	      else if (gimple_call_builtin_p (stmt,
+					      BUILT_IN_OMP_GET_THREAD_NUM))
+		{
+		  gcc_assert (high_assign == NULL);
+		  tree htype = TREE_TYPE (gimple_get_lhs (stmt));
+		  tree tmp2 = create_tmp_reg (TREE_TYPE (high_arg), NULL);
+		  
+		  high_assign = gimple_build_assign 
+		    (gimple_get_lhs (stmt), fold_convert (htype, tmp2));
+		  gsi_remove (&gsi2, true);
+		  gimple tmp_stmt = gimple_build_assign (tmp2, high_arg);
+		  gsi_insert_before (&gsi2, high_assign, GSI_NEW_STMT);
+		  gsi_insert_before (&gsi2, tmp_stmt, GSI_NEW_STMT);
+		}
+	      gsi_next (&gsi2);
+	    }
+	}      
       /* Declare local variables needed in CHILD_CFUN.  */
       block = DECL_INITIAL (child_fn);
       BLOCK_VARS (block) = vec2chain (child_cfun->local_decls);
@@ -4821,6 +5067,13 @@ expand_omp_taskreg (struct omp_region *region)
       if (loops_state_satisfies_p (LOOPS_NEED_FIXUP))
 	child_cfun->x_current_loops->state |= LOOPS_NEED_FIXUP;
 
+      /* We expand it before it is customarily done for other flavors because
+	 the call to the function __cilkrts_cilk_for_64/32 (inserted by the 
+	 function below) may use some variables and thus the function call 
+	 must be inserted before the unwanted variables are eliminated.  */
+      if (is_cilk_for)
+	expand_cilk_for_call (new_bb, entry_stmt, ws_args);
+
       /* Remove non-local VAR_DECLs from child_cfun->local_decls list.  */
       num = vec_safe_length (child_cfun->local_decls);
       for (srcidx = 0, dstidx = 0; srcidx < num; srcidx++)
@@ -4834,7 +5087,7 @@ expand_omp_taskreg (struct omp_region *region)
 	}
       if (dstidx != num)
 	vec_safe_truncate (child_cfun->local_decls, dstidx);
-
+ 
       /* Inform the callgraph about the new function.  */
       DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
       cgraph_add_new_function (child_fn, true);
@@ -4865,11 +5118,14 @@ expand_omp_taskreg (struct omp_region *region)
       pop_cfun ();
     }
 
-  /* Emit a library call to launch the children threads.  */
-  if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
-    expand_parallel_call (region, new_bb, entry_stmt, ws_args);
-  else
-    expand_task_call (new_bb, entry_stmt);
+  if (!is_cilk_for)
+    {
+      /* Emit a library call to launch the children threads.  */
+      if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
+	expand_parallel_call (region, new_bb, entry_stmt, ws_args);
+      else
+	expand_task_call (new_bb, entry_stmt);
+    }
   if (gimple_in_ssa_p (cfun))
     update_ssa (TODO_update_ssa_only_virtuals);
 }
@@ -6544,6 +6800,225 @@ expand_omp_for_static_chunk (struct omp_region *region,
     }
 }
 
+/* A subroutine of expand_omp_for.  Generate code for _Cilk_for loop.  
+   Given parameters: 
+   for (V = N1; V cond N2; V += STEP) BODY; 
+   
+   where COND is "<" or ">", we generate pseudocode
+    
+   for (ind_var = low; ind_var < high; ind_var++)
+   {  
+      if (COND is "<")
+	V = N1 + (ind_var * STEP);
+      else
+	V = N1 - (ind_var * STEP);
+
+      <BODY>
+    }  
+  
+    In the above pseudocode, low and high are function parameters of the
+    child function.  In the function below, we insert temporary
+    variables initialized by calls to two OMP builtins that cannot otherwise
+    appear in the body of a _Cilk_for (since OMP_FOR cannot be mixed
+    with _Cilk_for).  These calls are replaced with low and high
+    by the function that handles taskreg.  */
+
+
+static void
+expand_cilk_for (struct omp_region *region, struct omp_for_data *fd)
+{
+  bool broken_loop = region->cont == NULL;
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  basic_block entry_bb = region->entry;
+  basic_block cont_bb = region->cont;
+  
+  gcc_assert (EDGE_COUNT (entry_bb->succs) == 2);
+  gcc_assert (broken_loop
+	      || BRANCH_EDGE (entry_bb)->dest == FALLTHRU_EDGE (cont_bb)->dest);
+  basic_block l0_bb = FALLTHRU_EDGE (entry_bb)->dest;
+  basic_block l1_bb, l2_bb;
+
+  tree grain = gimple_cilk_for_grain (fd->for_stmt);
+  if (!broken_loop)
+    {
+      gcc_assert (BRANCH_EDGE (cont_bb)->dest == l0_bb);
+      gcc_assert (EDGE_COUNT (cont_bb->succs) == 2);
+      l1_bb = split_block (cont_bb, last_stmt (cont_bb))->dest;
+      l2_bb = BRANCH_EDGE (entry_bb)->dest;
+    }
+  else
+    {
+      BRANCH_EDGE (entry_bb)->flags &= ~EDGE_ABNORMAL;
+      l1_bb = split_edge (BRANCH_EDGE (entry_bb));
+      l2_bb = single_succ (l1_bb);
+    }
+  basic_block exit_bb = region->exit;
+  basic_block l2_dom_bb = NULL;
+
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+
+  /* The statements below, up to "tree high_val = ...", are pseudo statements
+     used to pass information to be used by expand_omp_taskreg.
+     low_val and high_val will be replaced by the __low and __high
+     parameter from the child function.
+
+     The call_exprs part is a place-holder, it is mainly used 
+     to distinctly identify to the top-level part that this is
+     where we should put low and high (reasoning given in header 
+     comment).  */
+
+  tree t = build_call_expr
+    (builtin_decl_explicit (BUILT_IN_OMP_GET_NUM_THREADS), 0);
+  t = fold_convert (type, t);
+  tree low_val = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+					   GSI_SAME_STMT);
+  t = build_call_expr (builtin_decl_explicit (BUILT_IN_OMP_GET_THREAD_NUM),
+		       0);
+  t = fold_convert (TREE_TYPE (fd->loop.v), t);
+  tree high_val = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+					   GSI_SAME_STMT);
+
+  tree ind_var = create_tmp_reg (type, "__cilk_ind_var");
+  gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_FOR);
+  
+  /* Not needed in SSA form right now.  */
+  gcc_assert (!gimple_in_ssa_p (cfun));
+  if (l2_dom_bb == NULL)
+    l2_dom_bb = l1_bb;
+
+  tree n1 = low_val;
+  tree n2 = high_val;
+  
+  expand_omp_build_assign (&gsi, ind_var, n1);
+
+  /* Remove the GIMPLE_OMP_FOR statement.  */
+  gsi_remove (&gsi, true);
+
+  gimple stmt;
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+      enum tree_code code = PLUS_EXPR;
+      if (POINTER_TYPE_P (type))
+	t = fold_build_pointer_plus (ind_var, build_one_cst (type)); 
+      else
+	t = fold_build2 (code, type, ind_var, build_one_cst (type));
+      expand_omp_build_assign (&gsi, ind_var, t);
+
+      /* Remove GIMPLE_OMP_CONTINUE.  */
+      gsi_remove (&gsi, true);
+    }
+
+  /* Emit the condition in L1_BB.  */
+  gsi = gsi_start_bb (l1_bb);
+
+  tree step = fold_convert (type, fd->loop.step);
+  if ((TREE_CODE (step) == INTEGER_CST && tree_int_cst_sgn (step) < 1)) 
+    step = fold_build1_loc (UNKNOWN_LOCATION, NEGATE_EXPR, type, step);
+  
+  t = build2 (MULT_EXPR, type, ind_var, step);
+  tree tmp = create_tmp_reg (type, NULL);
+  gsi_insert_after (&gsi, gimple_build_assign (tmp, t), GSI_NEW_STMT);
+
+  tree tmp2 = create_tmp_reg (type, NULL);
+  tree cvtd = fold_convert (type, fd->loop.n1);
+  gsi_insert_after (&gsi, gimple_build_assign (tmp2, cvtd), GSI_NEW_STMT);
+  
+  if (fd->loop.cond_code == GE_EXPR || fd->loop.cond_code == GT_EXPR)
+    t = fold_build2 (MINUS_EXPR, type, tmp2, tmp);
+ else
+   t = fold_build2 (PLUS_EXPR, type, tmp2, tmp);
+
+  tmp = create_tmp_reg (type, NULL);
+  gsi_insert_after (&gsi, gimple_build_assign (tmp, t), GSI_NEW_STMT);
+
+  cvtd = fold_convert (TREE_TYPE (fd->loop.v), tmp);
+  gsi_insert_after (&gsi, gimple_build_assign (fd->loop.v, cvtd), 
+		    GSI_NEW_STMT);
+  
+  t = fold_convert (type, n2);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+				false, GSI_CONTINUE_LINKING);
+  /* The condition is always '<' since the runtime will fill in the low
+     and high values.  */
+  t = build2 (LT_EXPR, boolean_type_node, ind_var, t);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p,
+		 NULL, NULL)
+      || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p,
+		    NULL, NULL))
+    {
+      gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+
+  /* Connect the new blocks.  */
+  remove_edge (FALLTHRU_EDGE (entry_bb));
+
+  edge e, ne;
+  if (!broken_loop)
+    {
+      remove_edge (BRANCH_EDGE (entry_bb));
+      make_edge (entry_bb, l1_bb, EDGE_FALLTHRU);
+
+      e = BRANCH_EDGE (l1_bb);
+      ne = FALLTHRU_EDGE (l1_bb);
+      e->flags = EDGE_TRUE_VALUE;
+    }
+  else
+    {
+      single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+
+      ne = single_succ_edge (l1_bb);
+      e = make_edge (l1_bb, l0_bb, EDGE_TRUE_VALUE);
+
+    }
+  ne->flags = EDGE_FALSE_VALUE;
+  e->probability = REG_BR_PROB_BASE * 7 / 8;
+  ne->probability = REG_BR_PROB_BASE / 8;
+
+  set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb);
+  set_immediate_dominator (CDI_DOMINATORS, l2_bb, l2_dom_bb);
+  set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb);
+
+  if (!broken_loop)
+    {
+      struct loop *loop = alloc_loop ();
+      loop->header = l1_bb;
+      loop->latch = cont_bb;
+      add_loop (loop, l1_bb->loop_father);
+      loop->safelen = INT_MAX;
+    }
+
+  /* Pick the correct library function based on the precision of the
+     induction variable type.  */
+  tree lib_fun = NULL_TREE;
+  if (TYPE_PRECISION (type) == 32)
+    lib_fun = cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (type) == 64)
+    lib_fun = cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+  
+  tree count = gimple_cilk_for_loop_count (fd->for_stmt);
+  gcc_assert (count != NULL_TREE);
+  
+  /* ws_args contains three pieces of information: the library function to
+     call (__cilkrts_cilk_for_32/__cilkrts_cilk_for_64), the loop count and
+     the grain value.  */
+  vec_alloc (region->ws_args, 3);
+  region->ws_args->quick_push (lib_fun);
+  region->ws_args->quick_push (count);
+  region->ws_args->quick_push (grain);
+}
 
 /* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
    loop.  Given parameters:
@@ -6884,6 +7359,8 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 
   if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD)
     expand_omp_simd (region, &fd);
+  else if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_CILKFOR)
+    expand_cilk_for (region, &fd);
   else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
 	   && !fd.have_ordered)
     {
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
new file mode 100644
index 0000000..3f68022
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
@@ -0,0 +1,100 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+static void check (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start;  ii < end; ii = ii + incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+static void check_reverse (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start; ii >= end; ii = ii - incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+
+int main (void)
+{
+  int Array[10];
+  int x = 9, y = 0, z = 3;
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 1133;
+  check (Array, 0, 10, 1, 1133);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 9; ii > -1; ii--)
+    Array[ii] = 4433;
+  check_reverse (Array, 9, 0, 1, 4433);
+
+  _Cilk_for (int ii = 9; ii > -1; --ii)
+    Array[ii] = 9988;
+  check_reverse (Array, 9, 0, 1, 9988);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    Array[ii] = 1328;
+  check (Array, 0, 10, 2, 1328);
+
+  _Cilk_for (int ii = 9; ii >= 0; ii -= 2)
+    Array[ii] = 1738;
+  check_reverse (Array, 9, 0, 2, 1738);
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if (ii % 2)
+	Array[ii] = 1343;
+      else
+	Array[ii] = 3413;
+    }
+
+  check (Array, 1, 10, 2, 1343); 
+  check (Array, 0, 10, 2, 3413); 
+
+  _Cilk_for (short cc = 0; cc < 10; cc++) 
+    Array[cc] = 1343;
+  check (Array, 0, 10,  1,1343);
+
+  _Cilk_for (short cc = 9; cc >= 0; cc--)
+    Array[cc] = 1348;
+  check_reverse (Array, 9, 0, 1, 1348);
+
+
+
+  /* Loop with polynomials in the start, final and incr.  */
+  _Cilk_for (int ii = z - 3; ii <= z * 3; ii += z-1)
+    { 
+      Array[ii] = 3233;
+    }
+
+  for (int ii = z-3; ii <= z*3; ii += (z-1))
+    if (Array[ii] != 3233)
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..0ebc09a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 5;
+
+  _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
+    q = 5;
+
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } */
+    q = 5;
+
+  _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..6cb9b03
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,35 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+
+  for (int ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (int ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..ff8bc0a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK, it will do a type conversion to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize = 1 
+  while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */
+    {
+    /* Blah */
+    }
+
+#pragma cilk grainsize = 1 
+  int q = 0; /* { dg-warning "grainsize pragma is not followed" } */
+  _Cilk_for (q = 0; q < 10; q++)
+    Array1[q]  = 5;
+
+  while (Array1[5] != 0)
+    {
+    /* Blah */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..7a779f7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,41 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int ii = 0;
+
+#if 1
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+#endif
+  _Cilk_for(int *aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      __builtin_abort ();
+#endif
+
+  _Cilk_for (int *aa = a; aa < a + ARRAY_SIZE; aa = aa + 2)
+    *aa = 4;
+
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii = ii + 2) 
+    if (a[ii] != 4) 
+      __builtin_abort ();
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
new file mode 100644
index 0000000..cffe17e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
@@ -0,0 +1,79 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+int main (void)
+{
+  int Array[10][10];
+
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj++)
+	{
+	  Array[ii][jj] = 0;
+	}
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 5; jj++)
+      Array[ii][jj] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 5; jj++)
+      if (Array[ii][jj] != 5)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+
+  /* One goes up and one goes down.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 9; jj >= 0; jj--)
+      Array[ii][jj] = 7;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 9; jj >= 0; jj--)
+      if (Array[ii][jj] != 7)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 10; jj += 2)
+      Array[ii][jj] = 9;
+  
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj += 2)
+      if (Array[ii][jj] != 9)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    _Cilk_for (int jj = 5; jj < 9; jj++)
+      Array[ii][jj] = 11; 
+  
+  for (int ii = 0; ii < 10; ii += 2)
+    for (int jj = 5; jj < 9; jj++)
+      if (Array[ii][jj] != 11)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 0822d35..9412bab 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -351,6 +351,7 @@ enum omp_clause_schedule_kind {
   OMP_CLAUSE_SCHEDULE_GUIDED,
   OMP_CLAUSE_SCHEDULE_AUTO,
   OMP_CLAUSE_SCHEDULE_RUNTIME,
+  OMP_CLAUSE_SCHEDULE_CILKFOR,
   OMP_CLAUSE_SCHEDULE_LAST
 };
 
diff --git a/gcc/tree.def b/gcc/tree.def
index 364e510..0a32bc4 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1051,6 +1051,10 @@ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
    Operands like for OMP_FOR.  */
 DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6)
 
+/* Cilk Plus - _Cilk_for (..)
+   Operands like for OMP_FOR.  */
+DEFTREECODE (CILK_FOR, "cilk_for", tcc_statement, 6)
+
 /* OpenMP - #pragma omp distribute [clause1 ... clauseN]
    Operands like for OMP_FOR.  */
 DEFTREECODE (OMP_DISTRIBUTE, "omp_distribute", tcc_statement, 6)
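
For readers following the omp-low.c part of the patch, here is a minimal sketch of the index mapping that expand_cilk_for sets up. It is not part of the patch. The names child_fn_sketch, trip_count_up, trip_count_down and run_sketch are hypothetical, and the trip-count formulas are an assumption, since cilk_for_compute_set_count_grain is not shown in this part of the diff. The point it illustrates is that the runtime hands the child function a half-open range of iteration numbers, and the user's loop variable is recomputed from the iteration number on every pass.

```c
#include <assert.h>

/* Hypothetical stand-in for the outlined "_cilk_for_fn" child function.
   LOW and HIGH play the role of the __low/__high parameters: a half-open
   range of iteration numbers, not values of the user's loop variable.  */
static void
child_fn_sketch (int *data, int n1, int step, int descending,
		 int low, int high)
{
  for (int ind_var = low; ind_var < high; ind_var++)
    {
      /* STEP is a magnitude here, mirroring the negation of a negative
	 constant step in expand_cilk_for.  */
      int v = descending ? n1 - ind_var * step : n1 + ind_var * step;
      data[v] += 1;
    }
}

/* Assumed trip count for "for (v = n1; v < n2; v += step)", step > 0.  */
static int
trip_count_up (int n1, int n2, int step)
{
  return n2 > n1 ? (n2 - n1 + step - 1) / step : 0;
}

/* Assumed trip count for "for (v = n1; v >= n2; v -= step)", step > 0.  */
static int
trip_count_down (int n1, int n2, int step)
{
  return n1 >= n2 ? (n1 - n2) / step + 1 : 0;
}

/* Emulates a runtime that splits the iterations into two chunks.
   Returns the number of array elements with an unexpected hit count.  */
static int
run_sketch (void)
{
  int up[10] = { 0 }, down[10] = { 0 }, bad = 0;

  /* Like: _Cilk_for (int ii = 0; ii < 10; ii += 2)  */
  int count = trip_count_up (0, 10, 2);
  child_fn_sketch (up, 0, 2, 0, 0, count / 2);
  child_fn_sketch (up, 0, 2, 0, count / 2, count);

  /* Like: _Cilk_for (int ii = 9; ii >= 0; ii -= 2)  */
  count = trip_count_down (9, 0, 2);
  child_fn_sketch (down, 9, 2, 1, 0, count / 2);
  child_fn_sketch (down, 9, 2, 1, count / 2, count);

  for (int i = 0; i < 10; i++)
    {
      bad += up[i] != (i % 2 == 0);
      bad += down[i] != (i % 2 == 1);
    }
  return bad;
}
```

Calling run_sketch () from a test driver should visit every even index once in the ascending case and every odd index once in the descending case, whichever way the range is chunked. That independence from chunking is what lets the runtime choose the split.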

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2013-12-16 21:21                       ` Jason Merrill
  2013-12-16 23:41                         ` Iyer, Balaji V
  2013-12-18  0:22                         ` Iyer, Balaji V
@ 2014-01-06 22:29                         ` Iyer, Balaji V
  2 siblings, 0 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-06 22:29 UTC (permalink / raw)
  To: Jason Merrill, 'Jeff Law', 'Aldy Hernandez'
  Cc: 'gcc-patches@gcc.gnu.org', 'rth@redhat.com',
	'Jakub Jelinek'

[-- Attachment #1: Type: text/plain, Size: 1565 bytes --]

Hi Everyone,
	Attached, please find two patches, one implementing _Cilk_for for C and one for C++, along with their ChangeLog entries. They are written on top of the OpenMP framework and use its trees and functions, as requested by Jason and Jeff. They do not change the outcome of any other tests (i.e. nothing that was failing now passes, and nothing that was passing now fails). They are tested and work on x86 and x86_64.

Is this Ok for trunk?

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Jason Merrill [mailto:jason@redhat.com]
> Sent: Monday, December 16, 2013 4:22 PM
> To: Iyer, Balaji V; 'Jeff Law'; 'Aldy Hernandez'
> Cc: 'gcc-patches@gcc.gnu.org'; 'rth@redhat.com'; 'Jakub Jelinek'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On 12/15/2013 07:40 PM, Iyer, Balaji V wrote:
> > -		       tree clauses, tree *cclauses)
> > +		       tree clauses_or_grain, tree *cclauses)
> 
> Instead of this, please make the grainsize a new type of clause.
> 
> > -  return (gimple_omp_subcode (g) & GF_OMP_FOR_COMBINED) != 0;
> > +  return (gimple_omp_for_kind (g) == GF_OMP_FOR_COMBINED);
> 
> I don't really know this code, but this change seems unlikely to be correct.
> Can you explain it?
> 
> > +  tree data_name = get_identifier (".omp_data_i");  if (is_cilk_for)
> > +    data_name = get_identifier (".cilk_for_data_i");
> 
> Why does the name of an artificial parameter matter?
> 
> >  }
> > +/* A subroutine of expand_omp_for.  Generate code for _Cilk_for loop.
> > +   Given parameters:
> 
> Need a blank line after the }.
> 
> Jason
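
On the GF_OMP_FOR_COMBINED question quoted above, a small stand-alone sketch may help. It copies the flag values from the gimple.h hunk of this patch and assumes that gimple_omp_for_kind returns the subcode masked with GF_OMP_FOR_KIND_MASK; the helper names for_kind, combined_by_mask and combined_by_kind are hypothetical. Under that assumption, the COMBINED bit lies outside the kind mask, so comparing the kind against it can never succeed, whereas the original bit test can.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied from the gimple.h hunk of this patch.  */
enum
{
  GF_OMP_FOR_KIND_MASK = 7 << 0,
  GF_OMP_FOR_KIND_CILKFOR = 4 << 0,
  GF_OMP_FOR_COMBINED = 1 << 3
};

/* The COMBINED flag does not overlap the bits reserved for the kind.  */
static_assert ((GF_OMP_FOR_COMBINED & GF_OMP_FOR_KIND_MASK) == 0,
	       "COMBINED must lie outside the kind mask");

/* Assumed to mirror gimple_omp_for_kind: the kind is the subcode
   restricted to the kind mask.  */
unsigned
for_kind (unsigned subcode)
{
  return subcode & GF_OMP_FOR_KIND_MASK;
}

/* The pre-patch form: test whether the COMBINED bit is set.  */
bool
combined_by_mask (unsigned subcode)
{
  return (subcode & GF_OMP_FOR_COMBINED) != 0;
}

/* The form in the quoted hunk: compare the kind against the flag.  */
bool
combined_by_kind (unsigned subcode)
{
  return for_kind (subcode) == GF_OMP_FOR_COMBINED;
}
```

For a subcode of GF_OMP_FOR_KIND_CILKFOR | GF_OMP_FOR_COMBINED, combined_by_mask reports the flag as set while combined_by_kind does not, which is the kind of divergence the review comment appears to be asking about.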


[-- Attachment #2: diff_c.txt --]
[-- Type: text/plain, Size: 64293 bytes --]

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 40d12bc..9d24691
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -414,6 +414,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7e3ece6..0eaebf3
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index ac380ee..b15cd4c
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -386,7 +386,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
   bool fail = false;
   int i;
 
-  if (code == CILK_SIMD
+  if ((code == CILK_SIMD || code == CILK_FOR) 
       && !c_check_cilk_loop (locus, TREE_VEC_ELT (declv, 0)))
     fail = true;
 
@@ -516,7 +516,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 					    0))
 		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
 					 ? LT_EXPR : GE_EXPR);
-		  else if (code != CILK_SIMD)
+		  else if (code != CILK_SIMD && code != CILK_FOR)
 		    cond_ok = false;
 		}
 	    }
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index af28085..6f22148
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1394,6 +1394,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 6f1bf74..b9f09ba
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index f73df08..358e669
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1248,10 +1248,11 @@ static bool c_parser_objc_diagnose_bad_element_prefix
   (c_parser *, struct c_declspecs *);
 
 /* Cilk Plus supporting routines.  */
-static void c_parser_cilk_simd (c_parser *);
+static void c_parser_cilk_simd (c_parser *, tree);
 static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context);
 static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
 static tree c_parser_cilk_clause_vectorlength (c_parser *, tree, bool);
+static void c_parser_cilk_grainsize (c_parser *);
 
 /* Parse a translation unit (C90 6.7, C99 6.9).
 
@@ -4878,6 +4879,16 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (c_parser_peek_token (parser)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      c_parser_skip_to_end_of_block_or_statement (parser);
+	    }
+	  else
+	    c_parser_cilk_simd (parser, integer_zero_node);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9488,7 +9499,25 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       if (!c_parser_cilk_verify_simd (parser, context))
 	return false;
       c_parser_consume_pragma (parser);
-      c_parser_cilk_simd (parser);
+      c_parser_cilk_simd (parser, NULL_TREE);
+      return false;
+
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not"
+		   " enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
       return false;
 
     default:
@@ -11583,7 +11612,7 @@ c_parser_omp_flush (c_parser *parser)
 
 static tree
 c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
-		       tree clauses, tree *cclauses)
+		       tree clauses_or_grain, tree *cclauses)
 {
   tree decl, cond, incr, save_break, save_cont, body, init, stmt, cl;
   tree declv, condv, incrv, initv, ret = NULL;
@@ -11591,6 +11620,9 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   int i, collapse = 1, nbraces = 0;
   location_t for_loc;
   vec<tree, va_gc> *for_block = make_tree_vector ();
+  tree clauses = code == CILK_FOR ? NULL_TREE : clauses_or_grain;
+  tree grain = code == CILK_FOR ? clauses_or_grain : NULL_TREE;
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
 
   for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl))
     if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE)
@@ -11603,11 +11635,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   condv = make_tree_vec (collapse);
   incrv = make_tree_vec (collapse);
 
-  if (!c_parser_next_token_is_keyword (parser, RID_FOR))
+  if (code != CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_FOR))
     {
       c_parser_error (parser, "for statement expected");
       return NULL;
     }
+  if (code == CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_CILK_FOR))
+    {
+      c_parser_error (parser, "_Cilk_for statement expected");
+      return NULL;
+    }
   for_loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
 
@@ -11685,7 +11724,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	    case LE_EXPR:
 	      break;
 	    case NE_EXPR:
-	      if (code == CILK_SIMD)
+	      if (code == CILK_SIMD || code == CILK_FOR)
 		break;
 	      /* FALLTHRU.  */
 	    default:
@@ -11767,6 +11806,12 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
     c_break_label = size_one_node;
   save_cont = c_cont_label;
   c_cont_label = NULL_TREE;
+
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = c_begin_omp_parallel ();
+    }
   body = push_stmt_list ();
 
   if (open_brace_parsed)
@@ -11814,6 +11859,13 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	}
     }
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = c_finish_omp_parallel (loc, NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
+
   /* Only bother calling c_finish_omp_for if we haven't already generated
      an error from the initialization parsing.  */
   if (!fail)
@@ -11859,6 +11911,11 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 		  }
 	    }
 	  OMP_FOR_CLAUSES (stmt) = clauses;
+	  /* If it is a _Cilk_for statement, then the OMP_FOR_CLAUSES location
+	     stores the user-defined grain value or an integer_zero_node 
+	     indicating that the runtime must compute a suitable grain.  */
+	  if (code == CILK_FOR)
+	    OMP_FOR_CLAUSES (stmt) = grain;
 	}
       ret = stmt;
     }
@@ -13762,16 +13819,65 @@ c_parser_cilk_all_clauses (c_parser *parser)
   return c_finish_cilk_clauses (clauses);
 }
 
+/* This function helps parse the grainsize pragma for a _Cilk_for statement. 
+   Here is the correct syntax of this pragma: 
+	    #pragma cilk grainsize = <EXP>  */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+
+  /* consume the 'grainsize' keyword.  */
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>") != 0)
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR)
+	{
+	  error_at (input_location, "cannot convert grain to long integer.\n");
+	  c_parser_skip_to_pragma_eol (parser);
+	}   
+      else if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token && token->type == CPP_KEYWORD
+	      && token->keyword == RID_CILK_FOR)
+	    {
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain && grain != error_mark_node) 
+		c_parser_cilk_simd (parser, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
    loops.  */
 
 static void
-c_parser_cilk_simd (c_parser *parser)
+c_parser_cilk_simd (c_parser *parser, tree grain)
 {
-  tree clauses = c_parser_cilk_all_clauses (parser);
+  bool is_cilk_for = grain == NULL_TREE ? false : true;
+  tree clauses = NULL_TREE;
+  
+  if (!is_cilk_for)
+    clauses = c_parser_cilk_all_clauses (parser);
+  else 
+    clauses = grain;
   tree block = c_begin_compound_stmt (true);
   location_t loc = c_parser_peek_token (parser)->location;
-  c_parser_omp_for_loop (loc, parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  c_parser_omp_for_loop (loc, parser, code, clauses, NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 }
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
index 9f3240a..bf319d5
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
index afe88c9..bc1092b
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -106,6 +106,26 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME.  */
+
+static tree
+cilk_declare_looper (const char *name, tree type, enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -269,6 +289,15 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_32",
+                                            unsigned_intSI_type_node,
+                                            BUILT_IN_CILK_FOR_32);
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_64",
+                                            unsigned_intDI_type_node,
+                                            BUILT_IN_CILK_FOR_64);
+
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
index d2ae931..0e98998
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,9 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
+
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -77,6 +80,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 2d1e1c7..1e7bebf 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1158,6 +1158,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_DISTRIBUTE:
 	  pp_string (buffer, "#pragma omp distribute");
 	  break;
+	case GF_OMP_FOR_KIND_CILKFOR:
+	  gcc_assert (flag_enable_cilkplus);
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -1167,7 +1170,11 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	  if (i)
 	    spc += 2;
 	  newline_and_indent (buffer, spc);
-	  pp_string (buffer, "for (");
+	  if (flag_enable_cilkplus 
+	      && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
+	    pp_string (buffer, "_Cilk_for (");
+	  else
+	    pp_string (buffer, "for (");
 	  dump_generic_node (buffer, gimple_omp_for_index (gs, i), spc,
 			     flags, false);
 	  pp_string (buffer, " = ");
@@ -1192,6 +1199,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	    case GE_EXPR:
 	      pp_greater_equal (buffer);
 	      break;
+	    case NE_EXPR:
+	      pp_string (buffer, "!=");
+	      break;
 	    default:
 	      gcc_unreachable ();
 	    }
diff --git a/gcc/gimple.h b/gcc/gimple.h
index df92863..42304fd
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -91,13 +91,14 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
-    GF_OMP_FOR_KIND_MASK	= 3 << 0,
+    GF_OMP_FOR_KIND_MASK	= 7 << 0,
     GF_OMP_FOR_KIND_FOR		= 0 << 0,
     GF_OMP_FOR_KIND_DISTRIBUTE	= 1 << 0,
     GF_OMP_FOR_KIND_SIMD	= 2 << 0,
     GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
-    GF_OMP_FOR_COMBINED		= 1 << 2,
-    GF_OMP_FOR_COMBINED_INTO	= 1 << 3,
+    GF_OMP_FOR_KIND_CILKFOR     = 4 << 0,
+    GF_OMP_FOR_COMBINED		= 1 << 3,
+    GF_OMP_FOR_COMBINED_INTO	= 1 << 4,
     GF_OMP_TARGET_KIND_MASK	= 3 << 0,
     GF_OMP_TARGET_KIND_REGION	= 0 << 0,
     GF_OMP_TARGET_KIND_DATA	= 1 << 0,
@@ -523,6 +524,9 @@ struct GTY(()) gimple_omp_for_iter {
 
   /* Increment.  */
   tree incr;
+
+  /* Grain value, only used by _Cilk_for.  */
+  tree grain;
 };
 
 /* GIMPLE_OMP_FOR */
@@ -4562,6 +4566,37 @@ gimple_omp_for_set_pre_body (gimple gs, gimple_seq pre_body)
   omp_for_stmt->pre_body = pre_body;
 }
 
+/* Set GRAIN to be the grain value used by Cilk runtime for OMP_FOR GS.  */
+
+static inline void
+gimple_cilk_for_set_grain (tree grain, gimple gs)
+{
+  const gimple_statement_omp_for *omp_for_stmt =
+    as_a <gimple_statement_omp_for> (gs);
+  omp_for_stmt->iter[0].grain = grain;
+}
+
+/* Returns the induction variable of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_induction_var (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->index;
+}
+
+/* Returns the GRAIN value of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_grain (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->grain;
+}
 
 /* Return the clauses associated with OMP_PARALLEL GS.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index a6e0c75..ab6088c
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6559,7 +6559,18 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
   bool simd;
   bitmap has_decl_expr = NULL;
 
+  tree grain = NULL_TREE;
+  tree orig_init = NULL_TREE, orig_cond = NULL_TREE, orig_incr = NULL_TREE;
   orig_for_stmt = for_stmt = *expr_p;
+  
+  if (TREE_CODE (for_stmt) == CILK_FOR) 
+    { 
+      /* The user cannot pass any clauses for _Cilk_for, thus the grain value
+	 is stored in the same location as the clauses to utilize
+	 the unused space.  */
+      grain = OMP_FOR_CLAUSES (for_stmt);
+      OMP_FOR_CLAUSES (for_stmt) = NULL_TREE;
+    }
 
   simd = TREE_CODE (for_stmt) == OMP_SIMD
     || TREE_CODE (for_stmt) == CILK_SIMD;
@@ -6603,6 +6614,11 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     }
 
   for_body = NULL;
+  if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
+    {
+      tree it = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), 0);
+      gimplify_and_add (it, &for_pre_body);
+    }
   gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
 	      == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
   gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
@@ -6677,7 +6693,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	}
       else
 	var = decl;
-
+ 
+      /* Original initial, final and increment values are necessary to compute
+	 the loop-count.  Otherwise, they are stored in variables and their
+	 context could be changed, potentially making it impossible to compute
+	 them correctly.  */
+      orig_init = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6689,10 +6710,18 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gcc_assert (COMPARISON_CLASS_P (t));
       gcc_assert (TREE_OPERAND (t, 0) == decl);
 
-      tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-			    is_gimple_val, fb_rvalue);
-      ret = MIN (ret, tret);
-
+      if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
+	{
+	  int x = 1;
+	  orig_cond = TREE_OPERAND (t, 1);
+	  copy_tree_r (&orig_cond, &x, NULL);
+	}
+      else
+	{
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL, 
+				is_gimple_val, fb_rvalue);
+	  ret = MIN (ret, tret);
+	}
       /* Handle OMP_FOR_INCR.  */
       t = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
       switch (TREE_CODE (t))
@@ -6713,6 +6742,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	    t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	    t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	    TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	    orig_incr = build_one_cst (TREE_TYPE (t));
 	    break;
 	  }
 
@@ -6726,6 +6756,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  orig_incr = build_one_cst (TREE_TYPE (t));
 	  break;
 
 	case MODIFY_EXPR:
@@ -6753,8 +6784,16 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	      gcc_unreachable ();
 	    }
 
-	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-				is_gimple_val, fb_rvalue);
+	  orig_incr = TREE_OPERAND (t, 1);
+	  /* Right here we are just trying to extract the absolute
+	     value of the increment.  */
+	  if (TREE_CODE (t) == MINUS_EXPR
+	      || TREE_CODE  (TREE_OPERAND (t, 1)) == NEGATE_EXPR
+	      || (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST
+		  && tree_int_cst_sgn (TREE_OPERAND (t, 1)) < 1))
+	    orig_incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), orig_incr);
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body,
+				NULL, is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
 	  if (c)
 	    {
@@ -6802,8 +6841,57 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 
   BITMAP_FREE (has_decl_expr);
 
+  tree incr_val = NULL_TREE, init_val = NULL_TREE, cond_val = NULL_TREE;
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      tree stmt_list = alloc_stmt_list ();
+      incr_val = create_tmp_var (TREE_TYPE (orig_incr), "__cilk_incr");
+      tree mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_incr), incr_val,
+			 orig_incr);
+      append_to_statement_list (mod, &stmt_list);
+
+      init_val = create_tmp_var (TREE_TYPE (orig_init), "__cilk_init");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_init), init_val, orig_init);
+      append_to_statement_list (mod, &stmt_list);
+
+      cond_val = create_tmp_var (TREE_TYPE (orig_cond), "__cilk_cond");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_cond), cond_val, orig_cond);
+      append_to_statement_list (mod, &stmt_list);
+  
+      gimplify_and_add (stmt_list, &for_pre_body);
+    }
   gimplify_and_add (OMP_FOR_BODY (orig_for_stmt), &for_body);
+ 
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      /* Sometimes an assign is inserted before the OMP_FOR_BODY.  So,
+	 search and find the omp for body.  */
+      gimple for_body_stmt = NULL;
+      for (gimple_stmt_iterator gsi = gsi_start (for_body); !gsi_end_p (gsi);
+	   gsi_next (&gsi))
+	{
+	  for_body_stmt = gsi_stmt (gsi);
+	  if (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL)
+	    break;
+	}
+      gcc_assert (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL);
+      tree orig_clses = gimple_omp_parallel_clauses (for_body_stmt);
+      tree new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = init_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = cond_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
 
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = incr_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      gimple_omp_parallel_set_clauses (for_body_stmt, new_clause);
+    }
   if (orig_for_stmt != for_stmt)
     for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
       {
@@ -6825,6 +6913,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
     case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
     case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break;
+    case CILK_FOR: kind = GF_OMP_FOR_KIND_CILKFOR; break;
     case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break;
     default:
       gcc_unreachable ();
@@ -6859,6 +6948,9 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gimple_omp_for_set_incr (gfor, i, TREE_OPERAND (t, 1));
     }
 
+  if (kind == GF_OMP_FOR_KIND_CILKFOR) 
+    gimple_cilk_for_set_grain (grain, gfor);
+
   gimplify_seq_add_stmt (pre_p, gfor);
   if (ret != GS_ALL_DONE)
     return GS_ERROR;
@@ -7880,6 +7972,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OMP_FOR:
 	case OMP_SIMD:
 	case CILK_SIMD:
+	case CILK_FOR:
 	case OMP_DISTRIBUTE:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index f1ec1c6..0beaa2a
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "tree-nested.h"
 #include "tree-eh.h"
+#include "cilk.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -198,6 +199,12 @@ struct omp_for_data
   struct omp_for_data_loop *loops;
 };
 
+/* A structure with necessary elements from _Cilk_for statement.  This
+   structure is passed in via WALK_STMT_INFO->INFO.  */
+typedef struct cilk_for_information {
+  bool found;
+  tree induction_var;
+} cilk_for_info;
 
 static splay_tree all_contexts;
 static int taskreg_nesting_level;
@@ -314,6 +321,9 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   fd->have_ordered = false;
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
   fd->chunk_size = NULL_TREE;
+  if (flag_enable_cilkplus 
+      && gimple_omp_for_kind (fd->for_stmt) ==  GF_OMP_FOR_KIND_CILKFOR)
+    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
   collapse_iter = NULL;
   collapse_count = NULL;
 
@@ -391,8 +401,10 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	case GT_EXPR:
 	  break;
 	case NE_EXPR:
-	  gcc_assert (gimple_omp_for_kind (for_stmt)
-		      == GF_OMP_FOR_KIND_CILKSIMD);
+	  gcc_assert ((gimple_omp_for_kind (for_stmt)
+		       == GF_OMP_FOR_KIND_CILKSIMD)
+		      || (gimple_omp_for_kind (for_stmt)
+			  == GF_OMP_FOR_KIND_CILKFOR));
 	  break;
 	case LE_EXPR:
 	  if (POINTER_TYPE_P (TREE_TYPE (loop->n2)))
@@ -897,7 +909,31 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx)
 	 variable no longer really shared.  */
       if (shared_ctx->is_nested)
 	{
-	  omp_context *up;
+	  omp_context *up = shared_ctx->outer;
+
+	  /* If VAR is the induction variable of the outer _Cilk_for, then
+	     it needs to be passed as a value not pointer since it
+	     would not be overwritten by the body.  */
+	  if (flag_enable_cilkplus
+	      && gimple_code (up->stmt) == GIMPLE_OMP_FOR
+	      && gimple_omp_for_kind (up->stmt) == GF_OMP_FOR_KIND_CILKFOR) 
+	    while (up) 
+	      { 
+		if (gimple_code (up->stmt) == GIMPLE_OMP_FOR
+		    && gimple_omp_for_kind (up->stmt)
+		    == GF_OMP_FOR_KIND_CILKFOR)
+		  {
+		    struct omp_for_data fd;
+		    /* _Cilk_for always has collapse = 1.  */
+		    struct omp_for_data_loop *loops
+		      = (struct omp_for_data_loop *)
+		      alloca (sizeof (struct omp_for_data_loop));
+		    extract_omp_for_data (up->stmt, &fd, loops);
+		    if (DECL_NAME (decl) == DECL_NAME (fd.loop.v))
+		      return false;
+		  }
+		up = up->outer;
+	      }
 
 	  for (up = shared_ctx->outer; up; up = up->outer)
 	    if (is_taskreg_ctx (up) && maybe_lookup_decl (decl, up))
@@ -1818,27 +1854,112 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	scan_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 }
 
-/* Create a new name for omp child function.  Returns an identifier.  */
+/* Create a new name for omp child function.  Returns an identifier.  If 
+   IS_CILK_FOR is true then the suffix for the child function is 
+   "_cilk_for_fn."  */
 
 static tree
-create_omp_child_function_name (bool task_copy)
+create_omp_child_function_name (bool task_copy, bool is_cilk_for)
 {
+  if (is_cilk_for)
+    return clone_function_name (current_function_decl, "_cilk_for_fn");
   return (clone_function_name (current_function_decl,
 			       task_copy ? "_omp_cpyfn" : "_omp_fn"));
 }
 
+/* Helper function for walk_gimple_seq.  *GSI_P is the gimple statement
+   iterator passed by walk_gimple_seq and *WI->INFO holds the CILK_FOR_INFO
+   structure.  This function sets the values inside this structure if it
+   finds a _Cilk_for statement in *GSI_P.  HANDLED_OPS_P is unused.  */
+
+static tree
+find_cilk_for_stmt (gimple_stmt_iterator *gsi_p,
+		    bool *handled_ops_p ATTRIBUTE_UNUSED,
+		    struct walk_stmt_info *wi)
+{
+  cilk_for_info *cf_info = (cilk_for_info *) wi->info;
+  gimple stmt = gsi_stmt (*gsi_p);
+
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+      && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_CILKFOR)
+      /* For nested _Cilk_for statements, just look into the
+	 outer-most one.  */
+      && cf_info->found == false)
+    {
+      cf_info->found = true;
+      cf_info->induction_var = gimple_cilk_for_induction_var (stmt);
+    }
+  return NULL_TREE;
+}
+
+/* Returns true if STMT contains a CILK_FOR statement.  If found then
+   set *IND_VAR with induction variable.  Otherwise these values remain 
+   untouched.  IND_VAR can be NULL and if so then it is left untouched.  */
+
+static bool
+is_cilk_for_stmt (gimple stmt, tree *ind_var)
+{
+  if (!flag_enable_cilkplus)
+    return false;
+    
+  gimple_seq body = stmt;
+  struct walk_stmt_info wi;
+  cilk_for_info cf_info;
+  memset (&cf_info, 0, sizeof (cilk_for_info));
+  memset (&wi, 0, sizeof (wi));
+  wi.info = &cf_info;
+  walk_gimple_seq (body, find_cilk_for_stmt, NULL, &wi);
+  if (cf_info.found)
+    {
+      if (ind_var)
+	*ind_var = cf_info.induction_var;
+      return true;
+    }
+    
+  return false;
+}
+
+/* Returns the type of the induction variable for the child function for
+   _Cilk_for and the types for _high and _low variables based on TYPE.  */
+
+static tree
+cilk_for_check_loop_diff_type (tree type)
+{
+  if (type == integer_type_node)
+    return type;
+  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
+    { 
+      if (TYPE_UNSIGNED (type)) 
+	return uint32_type_node;
+      else
+	return integer_type_node;
+    }
+  else
+    {
+      if (TYPE_UNSIGNED (type)) 
+	return uint64_type_node;
+      else
+	return long_long_integer_type_node;
+    }
+  gcc_unreachable ();
+}
+
 /* Build a decl for the omp child function.  It'll not contain a body
    yet, just the bare decl.  */
 
 static void
-create_omp_child_function (omp_context *ctx, bool task_copy)
+create_omp_child_function (omp_context *ctx, bool task_copy,
+			   bool is_cilk_for, tree cilk_var_type)
 {
   tree decl, type, name, t;
 
-  name = create_omp_child_function_name (task_copy);
+  name = create_omp_child_function_name (task_copy, is_cilk_for);
   if (task_copy)
     type = build_function_type_list (void_type_node, ptr_type_node,
 				     ptr_type_node, NULL_TREE);
+  else if (is_cilk_for)
+    type = build_function_type_list (void_type_node, ptr_type_node,
+				     cilk_var_type, cilk_var_type, NULL_TREE);
   else
     type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
 
@@ -1888,6 +2009,33 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_CONTEXT (t) = decl;
   DECL_RESULT (decl) = t;
 
+  /* _Cilk_for's child function requires two extra parameters called
+     __low and __high that are set by the Cilk runtime when it calls this
+     function.  */
+  if (is_cilk_for)
+    {
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__high"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__low"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+    }
   t = build_decl (DECL_SOURCE_LOCATION (decl),
 		  PARM_DECL, get_identifier (".omp_data_i"), ptr_type_node);
   DECL_ARTIFICIAL (t) = 1;
@@ -1895,6 +2043,8 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_ARG_TYPE (t) = ptr_type_node;
   DECL_CONTEXT (t) = current_function_decl;
   TREE_USED (t) = 1;
+  if (is_cilk_for)
+    DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
   DECL_ARGUMENTS (decl) = t;
   if (!task_copy)
     ctx->receiver_decl = t;
@@ -2016,7 +2166,15 @@ scan_omp_parallel (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+
+  tree ind_var = NULL_TREE;
+  bool is_cilk_for = (flag_enable_cilkplus && outer_ctx
+		      && is_cilk_for_stmt (outer_ctx->stmt, &ind_var));
+  tree cilk_var_type =
+    (is_cilk_for ? cilk_for_check_loop_diff_type (TREE_TYPE (ind_var))
+     : NULL_TREE);
+
+  create_omp_child_function (ctx, false, is_cilk_for, cilk_var_type);
   gimple_omp_parallel_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_parallel_clauses (stmt), ctx);
@@ -2061,7 +2219,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+  create_omp_child_function (ctx, false, false, NULL_TREE);
   gimple_omp_task_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_task_clauses (stmt), ctx);
@@ -2074,7 +2232,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
       DECL_ARTIFICIAL (name) = 1;
       DECL_NAMELESS (name) = 1;
       TYPE_NAME (ctx->srecord_type) = name;
-      create_omp_child_function (ctx, true);
+      create_omp_child_function (ctx, true, false, NULL_TREE);
     }
 
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
@@ -2199,7 +2357,7 @@ scan_omp_target (gimple stmt, omp_context *outer_ctx)
   TYPE_NAME (ctx->record_type) = name;
   if (kind == GF_OMP_TARGET_KIND_REGION)
     {
-      create_omp_child_function (ctx, false);
+      create_omp_child_function (ctx, false, false, NULL_TREE);
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
@@ -2993,6 +3151,15 @@ lower_rec_simd_input_clauses (tree new_var, omp_context *ctx, int &max_vf,
   return true;
 }
 
+/* Returns true if the variable name in DECL starts with NAME.  */
+
+static inline bool
+is_cilk_loop_var (tree decl, const char *name)
+{
+  return (DECL_NAME (decl) && !strncmp (IDENTIFIER_POINTER (DECL_NAME (decl)), 
+					name, strlen (name))); 
+}
+
 /* Generate code to implement the input clauses, FIRSTPRIVATE and COPYIN,
    from the receiver (aka child) side and initializers for REFERENCE_TYPE
    private variables.  Initialization statements go in ILIST, while calls
@@ -3245,6 +3412,18 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 	      SET_DECL_VALUE_EXPR (new_var, x);
 	      DECL_HAS_VALUE_EXPR_P (new_var) = 1;
 
+	      /* In _Cilk_for, the increment, start and final values
+		 are stored in the clause inserted by gimplify_omp_for.
+		 The child function uses these values to compute the
+		 value of the induction variable from its __low and
+		 __high parameters.
+		 Now, we need to store the decl value expressions here so
+		 that we can easily access them.  */
+	      if (flag_enable_cilkplus 
+		  && (is_cilk_loop_var (var, "__cilk_init") 
+		      || is_cilk_loop_var (var, "__cilk_cond")
+		      || is_cilk_loop_var (var, "__cilk_incr"))) 
+		SET_DECL_VALUE_EXPR (var, x);
 	      /* ??? If VAR is not passed by reference, and the variable
 		 hasn't been initialized yet, then we'll get a warning for
 		 the store into the omp_data_s structure.  Ideally, we'd be
@@ -4628,6 +4807,250 @@ expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from)
     }
 }
 
+/* Returns true if T is a tree whose code is COMPONENT_REF and its field
+   matches D_F_NAME and the data argument matches D_ARG_NAME.  */
+
+static bool
+cilk_find_field_value (tree t, tree d_arg_name, tree d_f_name)
+{
+  if (TREE_CODE (t) == COMPONENT_REF)
+    {
+      tree arg = TREE_OPERAND (t, 0);
+      tree field = TREE_OPERAND (t, 1);
+      if (TREE_CODE (arg) == ADDR_EXPR || TREE_CODE (arg) == MEM_REF)
+	arg = TREE_OPERAND (arg, 0);
+      if (DECL_NAME (arg) && DECL_NAME (field)
+	  && !strcmp (IDENTIFIER_POINTER (d_arg_name),
+		      IDENTIFIER_POINTER (DECL_NAME (arg)))
+	  && !strcmp (IDENTIFIER_POINTER (d_f_name),
+		      IDENTIFIER_POINTER (DECL_NAME (field)))) 
+	return true;
+    }
+  return false;
+}
+
+/* Find the COMPONENT_REF in all the basic blocks in REGION whose 
+   data-argument is DATA_ARG and field is FIELD and then replace that 
+   COMPONENT_REF value with NEW_VALUE, a VAR_DECL.  */
+
+static void
+cilk_for_find_component_expr (struct omp_region *region, tree data_arg,
+			      tree field, tree new_value)
+{
+  vec<basic_block> bbs;
+  basic_block bb;
+  unsigned ii;
+  tree new_val = NULL_TREE;
+  bbs.create (0);
+  gather_blocks_in_sese_region (region->entry, region->exit, &bbs);
+  /* No need to push the entry bb into BBS since it doesn't get inserted
+     into the child function.  */
+  
+  tree da_name = DECL_NAME (data_arg);
+  tree df_name = DECL_NAME (field);
+  FOR_EACH_VEC_ELT (bbs, ii, bb)    
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+	 gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+	if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	  for (unsigned jj = 1; jj < gimple_num_ops (stmt); jj++)
+	    {
+	      tree *op = gimple_op_ptr (stmt, jj);
+	      if (TREE_CODE (*op) == COMPONENT_REF
+		  && cilk_find_field_value (*op, da_name, df_name))
+		{    
+		  if (TREE_TYPE (*op) == TREE_TYPE (new_value))
+		    new_val = new_value;
+		  else
+		    {
+		      tree t = fold_convert (TREE_TYPE (*op), new_value);
+		      new_val =
+			force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+						  true, GSI_NEW_STMT);
+		    }
+		  gsi_insert_before (&gsi, gimple_build_assign (*op, new_val), 
+				     GSI_NEW_STMT);
+		  *op = new_val;
+		}
+	    }
+      }
+}
+
+/* Find the initial, final and increment values in BODY_STMT's clause
+   and store them in *INIT, *FINAL and *INCR parameters respectively.  */
+
+static void
+find_cilk_for_vars (gimple body_stmt, tree *init, tree *final, tree *incr)
+{
+  /* The names of the initial, final and increment values start with
+     __cilk_init, __cilk_cond and __cilk_incr, respectively.  These values
+     are defined in shared clauses, so we search those.  */
+  for (tree cc = gimple_omp_parallel_clauses (body_stmt); cc; 
+       cc = OMP_CLAUSE_CHAIN (cc))
+    if (OMP_CLAUSE_CODE (cc) == OMP_CLAUSE_SHARED)
+      {
+	tree decl = OMP_CLAUSE_DECL (cc);
+	if (is_cilk_loop_var (decl, "__cilk_incr"))
+	  { 
+	    *incr = decl;
+	    if (DECL_VALUE_EXPR (*incr))
+	      *incr = DECL_VALUE_EXPR (*incr);
+	  } 
+	else if (is_cilk_loop_var (decl, "__cilk_init"))
+	  { 
+	    *init = decl;
+	    if (DECL_VALUE_EXPR (*init))
+	      *init = DECL_VALUE_EXPR (*init);
+	  }
+	else if (is_cilk_loop_var (decl, "__cilk_cond"))
+	  { 
+	    *final = decl;
+	    if (DECL_VALUE_EXPR (*final))
+	      *final = DECL_VALUE_EXPR (*final);
+	  }
+      }
+}
+ 
+/* Expand the _Cilk_for body starting at REGION.  DATA_ARG, LOW and HIGH
+   are the data-argument, __low and __high parameters of the child
+   function.  */
+
+static void
+expand_cilk_for_body (struct omp_region *region, tree data_arg,
+		      tree low, tree high)
+{
+  struct omp_for_data fd;
+  struct omp_for_data_loop *loops;
+  loops
+    = (struct omp_for_data_loop *)
+      alloca (gimple_omp_for_collapse (last_stmt (region->outer->entry))
+	      * sizeof (struct omp_for_data_loop));
+  extract_omp_for_data (last_stmt (region->outer->entry), &fd, loops);
+  region->sched_kind = fd.sched_kind;
+  basic_block entry_bb = region->entry;
+  
+  /* This is where the body is and the location where we must insert
+     the modification to the induction variable.  */
+  basic_block body_bb = single_succ (region->entry);
+  gimple entry_stmt = last_stmt (region->entry);
+  
+  /* Split the first basic block into two and put the initializer values
+     in the top one.  */
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  basic_block l1_bb = split_block (entry_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (l1_bb);
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd.loop.v));
+  tree ind_var = create_tmp_reg (type, "__cilk_ind_var");
+  tree t = fold_convert (type, low);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_NEW_STMT);
+  gimple stmt = gimple_build_assign (ind_var, fold_convert (type, t));
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  vec_alloc (region->ws_args, 2);
+  tree t1 = null_pointer_node;
+  tree t2 = build_fold_addr_expr (gimple_omp_parallel_child_fn (entry_stmt));
+  if (data_arg)
+    {
+      t1 = build_fold_addr_expr (gimple_omp_parallel_data_arg (entry_stmt));
+      gsi = gsi_start_bb (body_bb);
+      tree init = NULL_TREE, final_val = NULL_TREE, incr = NULL_TREE;
+      find_cilk_for_vars (entry_stmt, &init, &final_val, &incr);
+
+      tree step = fd.loop.step;
+      if (TREE_CODE (fd.loop.step) != INTEGER_CST)
+	step = incr;      
+      step = fold_convert (type, step);
+      if (TREE_CODE (step) == INTEGER_CST && tree_int_cst_sgn (step) < 1)
+	step = fold_build1_loc (UNKNOWN_LOCATION, NEGATE_EXPR, type, step);
+      
+      tree tmp = create_tmp_reg (type, NULL);
+      gsi_insert_before (&gsi, gimple_build_assign (tmp, step),
+			 GSI_NEW_STMT);
+      t = build2 (MULT_EXPR, type, ind_var, tmp);
+      tree tmp2 = create_tmp_reg (type, NULL);
+      gsi_insert_after (&gsi, gimple_build_assign (tmp2, t), GSI_NEW_STMT);
+
+      tmp = create_tmp_reg (type, NULL);
+      init = fold_convert (type, init);
+      tree init_tmp = force_gimple_operand_gsi
+	(&gsi, init, true, NULL_TREE, false, GSI_CONTINUE_LINKING); 
+
+      gsi_insert_after (&gsi, gimple_build_assign (tmp, init_tmp), 
+			GSI_NEW_STMT);
+      if (fd.loop.cond_code == GE_EXPR || fd.loop.cond_code == GT_EXPR) 
+	t = fold_build2 (MINUS_EXPR, type, tmp, tmp2);
+      else 
+	t = fold_build2 (PLUS_EXPR, type, tmp, tmp2);
+
+      t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false, 
+				    GSI_CONTINUE_LINKING);
+      tree tmp3 = create_tmp_reg (type, NULL);
+      gimple stmt = gimple_build_assign (tmp3, t);
+      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+      cilk_for_find_component_expr (region, data_arg, fd.loop.v, tmp3);
+    }
+  region->ws_args->quick_push (t1);
+  region->ws_args->quick_push (t2);
+  
+  gsi = gsi_last_bb (l1_bb);
+  basic_block cond_bb = split_block (l1_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (l1_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (cond_bb);
+  t = fold_convert (type, high);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_CONTINUE_LINKING);
+  t = build2 (LT_EXPR, boolean_type_node, ind_var, t);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p,
+		 NULL, NULL)
+      || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p,
+		    NULL, NULL))
+    {
+      gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+
+  /* Insert incrementing of induction variable.  */
+  gsi = gsi_last_bb (body_bb);
+  t = build2 (PLUS_EXPR, type, ind_var, build_one_cst (type));
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+  gsi_insert_after (&gsi, gimple_build_assign (ind_var, t),
+		    GSI_CONTINUE_LINKING);
+  
+  basic_block exit_bb = region->exit;
+
+  gsi = gsi_last_bb (exit_bb);
+  basic_block last_bb = split_block (exit_bb, gsi_stmt (gsi))->dest;
+  
+  /* Remove the #pragma omp return.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+  
+  gsi = gsi_last_bb (last_bb);
+  gsi_insert_before (&gsi, gimple_build_return (NULL), GSI_SAME_STMT);
+  
+  /* Now connect all the basic-blocks.  */
+  edge e = make_edge (cond_bb, last_bb, EDGE_FALSE_VALUE);
+  e->probability = REG_BR_PROB_BASE / 4;
+
+  edge e3 = find_edge (cond_bb, body_bb);
+  e3->probability = REG_BR_PROB_BASE * 3 / 4;
+  e3->flags = EDGE_TRUE_VALUE;
+  
+  edge e2 = find_edge (exit_bb, last_bb);
+  remove_edge (e2);
+  e2 = make_edge (exit_bb, cond_bb, EDGE_FALLTHRU);
+  e2->probability = 1;
+  region->exit = last_bb;
+}
+
 /* Expand the OpenMP parallel or task directive starting at REGION.  */
 
 static void
@@ -4640,6 +5063,7 @@ expand_omp_taskreg (struct omp_region *region)
   gimple entry_stmt, stmt;
   edge e;
   vec<tree, va_gc> *ws_args;
+  gimple parcopy_stmt = NULL;
 
   entry_stmt = last_stmt (region->entry);
   child_fn = gimple_omp_taskreg_child_fn (entry_stmt);
@@ -4648,6 +5072,16 @@ expand_omp_taskreg (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  /* The compiler represents a _Cilk_for statement as a #pragma omp for
+     whose body is enclosed in a #pragma omp parallel.  In this routine,
+     we handle inserting the body into the child function and putting a
+     loop around it to go from low to high.  NOTE: Even though this is
+     how the compiler breaks them down, they do NOT function the same
+     way.  */
+  bool is_cilk_for =
+    (flag_enable_cilkplus && region->outer
+     && is_cilk_for_stmt (last_stmt (region->outer->entry), NULL));
+    
   if (is_combined_parallel (region))
     ws_args = region->ws_args;
   else
@@ -4698,7 +5132,6 @@ expand_omp_taskreg (struct omp_region *region)
 	  basic_block entry_succ_bb = single_succ (entry_bb);
 	  gimple_stmt_iterator gsi;
 	  tree arg, narg;
-	  gimple parcopy_stmt = NULL;
 
 	  for (gsi = gsi_start_bb (entry_succ_bb); ; gsi_next (&gsi))
 	    {
@@ -4755,6 +5188,29 @@ expand_omp_taskreg (struct omp_region *region)
 	    }
 	}
 
+      /* Extract the __high and __low parameters from the child function.  */
+      tree high_arg = NULL_TREE, low_arg = NULL_TREE;
+      if (is_cilk_for)
+	{
+	  for (tree ii_arg = DECL_ARGUMENTS (child_fn); ii_arg != NULL_TREE;
+	       ii_arg = DECL_CHAIN (ii_arg))
+	    {
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)),
+			   "__high"))
+		high_arg = ii_arg;
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__low"))
+		low_arg = ii_arg;
+	    }
+	  gcc_assert (high_arg);
+	  gcc_assert (low_arg);
+	  expand_cilk_for_body (region, gimple_get_lhs (parcopy_stmt),
+				low_arg, high_arg);
+
+	  /* A new BB is added to the end of EXIT_BB and thus it needs to be
+	     updated.  */
+	  exit_bb = region->exit;
+	}
+
       /* Declare local variables needed in CHILD_CFUN.  */
       block = DECL_INITIAL (child_fn);
       BLOCK_VARS (block) = vec2chain (child_cfun->local_decls);
@@ -4787,7 +5243,7 @@ expand_omp_taskreg (struct omp_region *region)
       single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
 
       /* Convert GIMPLE_OMP_RETURN into a RETURN_EXPR.  */
-      if (exit_bb)
+      if (exit_bb && !is_cilk_for)
 	{
 	  gsi = gsi_last_bb (exit_bb);
 	  gcc_assert (!gsi_end_p (gsi)
@@ -4861,11 +5317,16 @@ expand_omp_taskreg (struct omp_region *region)
       pop_cfun ();
     }
 
-  /* Emit a library call to launch the children threads.  */
-  if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
-    expand_parallel_call (region, new_bb, entry_stmt, ws_args);
-  else
-    expand_task_call (new_bb, entry_stmt);
+  /* In _Cilk_for, the call to the runtime function is inserted by
+     expand_omp_for.  */
+  if (!is_cilk_for)
+    {
+      /* Emit a library call to launch the children threads.  */
+      if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
+	expand_parallel_call (region, new_bb, entry_stmt, ws_args);
+      else
+	expand_task_call (new_bb, entry_stmt);
+    }
   if (gimple_in_ssa_p (cfun))
     update_ssa (TODO_update_ssa_only_virtuals);
 }
@@ -6540,6 +7001,122 @@ expand_omp_for_static_chunk (struct omp_region *region,
     }
 }
 
+/* A subroutine of expand_omp_for.  Insert a call to the Cilk library
+   function __cilkrts_cilk_for_64/32 at the end of REGION.  The loop count
+   is calculated using step, n1 and n2 from FD.  */
+
+static void
+insert_cilk_for_fn_call (struct omp_region *region, struct omp_for_data *fd)
+{
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  basic_block entry_bb = region->entry;
+  bool broken_loop = region->cont == NULL;
+  basic_block cont_bb = region->cont;
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  tree diff_type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  tree grain = gimple_cilk_for_grain (fd->for_stmt);
+  
+  /* Convert n2 and n1 to the type we need.  */
+  tree n1 = fold_convert (diff_type, fd->loop.n1);
+  tree n2 = fold_convert (diff_type, fd->loop.n2);
+
+  n1 = force_gimple_operand_gsi (&gsi, n1, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  n2 = force_gimple_operand_gsi (&gsi, n2, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  tree diff_val = fold_build2 (MINUS_EXPR, diff_type, n2, n1);
+
+  diff_val = force_gimple_operand_gsi (&gsi, diff_val, true, NULL_TREE,
+					    true, GSI_SAME_STMT);
+  tree step = fd->loop.step;
+  tree step_convert = force_gimple_operand_gsi (&gsi,
+						fold_convert (diff_type, step),
+						true, NULL_TREE, true,
+						GSI_SAME_STMT);
+  tree count = fold_build2 (TRUNC_DIV_EXPR, diff_type, diff_val, step_convert);
+  count = force_gimple_operand_gsi (&gsi, count, true, NULL_TREE, true,
+				    GSI_SAME_STMT);
+
+  tree data_arg_ptr = (*region->ws_args)[0];
+  tree child_fn = (*region->ws_args)[1];
+
+  tree lib_fun = NULL_TREE;
+  if (TYPE_PRECISION (type) == 32)
+    lib_fun = cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (type) == 64)
+    lib_fun = cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+
+  vec<tree, va_gc> *args;
+  vec_alloc (args, 4);
+  args->quick_push (child_fn);
+  args->quick_push (data_arg_ptr);
+  args->quick_push (count);
+  args->quick_push (grain);
+  tree t = build_call_expr_loc_vec (UNKNOWN_LOCATION, lib_fun, args);
+  gsi_remove (&gsi, true);
+
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      gimple stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+      gsi_remove (&gsi, true);
+      
+      /* Remove the edge to the OMP continue block.  */
+      unsigned int ii = 0;
+      while (EDGE_COUNT (cont_bb->succs) > 1)
+	{
+	  edge ee = EDGE_SUCC (cont_bb, ii);
+	  if (!(ee->flags & EDGE_FALLTHRU))
+	    remove_edge (ee);
+	  ii++;
+	}      
+      gsi = gsi_start_bb (cont_bb);
+      gsi_remove (&gsi, true);
+      force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+    }
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (region->exit);
+  gimple stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_OMP_RETURN);
+  gsi_remove (&gsi, true);
+
+  gsi = gsi_last_bb (region->entry);
+  t = fold_build2 (fd->loop.cond_code, boolean_type_node, n1, n2);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  /* Here we are replacing a _Cilk_for statement with something
+     like this:
+
+     if (n1 <cond_code> n2)
+       goto bb1
+     else
+       goto bb2
+
+     bb1:
+       .omp_data_o.__cilk_incr = __cilk_incr;
+       ...
+       __cilkrts_cilk_for_{32/64} (func_name, &.omp_data_o, <count>, <grain>);
+
+     bb2:
+       clobber all values and go out.  */
+  unsigned int ii = 0;
+  while (ii < EDGE_COUNT (region->entry->succs))
+    {
+      edge ee = EDGE_SUCC (region->entry, ii);
+      if (ee->flags & EDGE_FALLTHRU)
+	ee->flags = EDGE_TRUE_VALUE;
+      else
+	ee->flags = EDGE_FALSE_VALUE;
+      ii++;
+    }
+}
 
 /* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
    loop.  Given parameters:
@@ -6880,6 +7457,12 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 
   if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD)
     expand_omp_simd (region, &fd);
+  else if (flag_enable_cilkplus 
+	   && (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_CILKFOR))
+    {
+      region->ws_args = region->inner->ws_args;
+      insert_cilk_for_fn_call (region, &fd);
+    }
   else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
 	   && !fd.have_ordered)
     {
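
For readers following the omp-low.c changes above, the shape of the lowered code can be approximated in plain C.  This is only a sketch under stated assumptions: fake_cilk_for is a hypothetical serial stand-in for __cilkrts_cilk_for_32 (the real runtime may run the pieces in parallel, and its splitting policy is not shown in this patch), and loop_data, child_fn, fake_split and run_demo are illustrative names that do not appear in the patch.  It mirrors two things the patch does: the trip count is computed as (n2 - n1) / step, and the child function rebuilds the user's induction variable as init + k * step for each k in [__low, __high).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the data record the compiler passes to the
   child function.  */
struct loop_data
{
  int init;    /* Plays the role of __cilk_init.  */
  int step;    /* Plays the role of __cilk_incr.  */
  int *array;  /* A shared variable used by the loop body.  */
};

typedef void (*child_fn_t) (void *, uint32_t, uint32_t);

/* Shape of the child function: the data pointer followed by __low and
   __high.  It runs iterations LOW up to, but not including, HIGH.  */
static void
child_fn (void *data, uint32_t low, uint32_t high)
{
  struct loop_data *d = (struct loop_data *) data;
  for (uint32_t k = low; k < high; k++)
    {
      /* Rebuild the user's induction variable from the iteration number.  */
      int ii = d->init + (int) k * d->step;
      d->array[ii] = 1328;
    }
}

/* Assumed splitting policy: ranges of at most GRAIN iterations go to the
   child function, larger ones are split in half.  */
static void
fake_split (child_fn_t fn, void *data, uint32_t low, uint32_t high,
	    uint32_t grain)
{
  if (high - low <= grain)
    {
      fn (data, low, high);
      return;
    }
  uint32_t mid = low + (high - low) / 2;
  fake_split (fn, data, low, mid, grain);
  fake_split (fn, data, mid, high, grain);
}

/* Serial stand-in for the runtime entry point, taking the same four
   kinds of argument as the call built by insert_cilk_for_fn_call.  */
static void
fake_cilk_for (child_fn_t fn, void *data, uint32_t count, uint32_t grain)
{
  assert (grain > 0);
  fake_split (fn, data, 0, count, grain);
}

/* Models _Cilk_for (int ii = 0; ii < 10; ii += 2) array[ii] = 1328;
   and returns how many elements were written.  */
int
run_demo (void)
{
  int array[10] = { 0 };
  struct loop_data d = { 0, 2, array };
  int n1 = 0, n2 = 10;

  /* Trip count as in insert_cilk_for_fn_call: (n2 - n1) / step.  */
  uint32_t count = (uint32_t) ((n2 - n1) / d.step);
  fake_cilk_for (child_fn, &d, count, 2);

  int written = 0;
  for (int i = 0; i < 10; i++)
    if (array[i] == 1328)
      written++;
  return written;
}
```

Calling run_demo () touches only the even indices 0 through 8, which is what the "ii += 2" case in cilk-fors.c checks for.
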
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
new file mode 100644
index 0000000..a80f413
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
@@ -0,0 +1,100 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+static void check (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start; ii < end; ii = ii + incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+static void check_reverse (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start; ii >= end; ii = ii - incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+
+int main (void)
+{
+  int Array[10];
+  int x = 9, y = 0, z = 3;
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 1133;
+  check (Array, 0, 10, 1, 1133);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 9; ii > -1; ii--)
+    Array[ii] = 4433;
+  check_reverse (Array, 9, 0, 1, 4433);
+
+  _Cilk_for (int ii = 9; ii > -1; --ii)
+    Array[ii] = 9988;
+  check_reverse (Array, 9, 0, 1, 9988);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    Array[ii] = 1328;
+  check (Array, 0, 10, 2, 1328);
+
+  _Cilk_for (int ii = 9; ii >= 0; ii -= 2)
+    Array[ii] = 1738;
+  check_reverse (Array, 9, 0, 2, 1738);
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if (ii % 2)
+	Array[ii] = 1343;
+      else
+	Array[ii] = 3413;
+    }
+
+  check (Array, 1, 10, 2, 1343); 
+  check (Array, 0, 10, 2, 3413); 
+
+  _Cilk_for (short cc = 0; cc < 10; cc++) 
+    Array[cc] = 1343;
+  check (Array, 0, 10, 1, 1343);
+
+  _Cilk_for (short cc = 9; cc >= 0; cc--)
+    Array[cc] = 1348;
+  check_reverse (Array, 9, 0, 1, 1348);
+
+
+
+  /* Loop with polynomials in _Cilk_for.  */
+  _Cilk_for (int ii = z - 3; ii <= z * 3; ii += 2)
+    { 
+      Array[ii] = 3233;
+    }
+
+  for (int ii = z-3; ii <= z*3; ii += 2)
+    if (Array[ii] != 3233)
+      __builtin_abort ();
+
+  return 0;
+}
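
The "short cc" loops in the test above exercise the widening rule in cilk_for_check_loop_diff_type.  As a rough model only, assuming a 32-bit int and a 64-bit long long (which should hold on the i?86 and x86_64 targets this test is restricted to), the choice can be written in plain C.  The names widened and widen_loop_type are hypothetical, and the strings are just labels for the GCC type nodes and runtime entry points named in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result record: the widened type and the runtime entry
   point that matches its precision.  */
struct widened
{
  const char *type;
  const char *entry;
};

/* PRECISION is the bit width of the induction variable's type.  Anything
   up to 32 bits is widened to a 32-bit type, everything else to a 64-bit
   type; signedness is preserved.  */
struct widened
widen_loop_type (unsigned precision, bool is_unsigned)
{
  struct widened w;
  if (precision <= 32)
    {
      w.type = is_unsigned ? "uint32" : "int";
      w.entry = "__cilkrts_cilk_for_32";
    }
  else
    {
      w.type = is_unsigned ? "uint64" : "long long";
      w.entry = "__cilkrts_cilk_for_64";
    }
  return w;
}
```

Under these assumptions a 16-bit signed short maps to int and the 32-bit entry point, so the child function receives __low and __high as 32-bit values.
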
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..0ebc09a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 5;
+
+  _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
+    q = 5;
+
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } */
+    q = 5;
+
+  _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..6cb9b03
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,35 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+
+  for (int ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (int ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..ff8bc0a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK, it will do a type conversion to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize = 1 
+  while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */
+    {
+    /* Blah */
+    }
+
+#pragma cilk grainsize = 1 
+  int q = 0; /* { dg-warning "grainsize pragma is not followed" } */
+  _Cilk_for (q = 0; q < 10; q++)
+    Array1[q]  = 5;
+
+  while (Array1[5] != 0)
+    {
+    /* Blah */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..7a779f7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,41 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int ii = 0;
+
+#if 1
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+#endif
+  _Cilk_for(int *aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      __builtin_abort ();
+#endif
+
+  _Cilk_for (int *aa = a; aa < a + ARRAY_SIZE; aa = aa + 2)
+    *aa = 4;
+
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii = ii + 2) 
+    if (a[ii] != 4) 
+      __builtin_abort ();
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
new file mode 100644
index 0000000..cffe17e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
@@ -0,0 +1,79 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+int main (void)
+{
+  int Array[10][10];
+
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj++)
+	{
+	  Array[ii][jj] = 0;
+	}
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 5; jj++)
+      Array[ii][jj] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 5; jj++)
+      if (Array[ii][jj] != 5)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+
+  /* One goes up and one goes down.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 9; jj >= 0; jj--)
+      Array[ii][jj] = 7;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 9; jj >= 0; jj--)
+      if (Array[ii][jj] != 7)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 10; jj += 2)
+      Array[ii][jj] = 9;
+  
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj += 2)
+      if (Array[ii][jj] != 9)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    _Cilk_for (int jj = 5; jj < 9; jj++)
+      Array[ii][jj] = 11; 
+  
+  for (int ii = 0; ii < 10; ii += 2)
+    for (int jj = 5; jj < 9; jj++)
+      if (Array[ii][jj] != 11)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 0a41b86..988408a 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -351,6 +351,7 @@ enum omp_clause_schedule_kind {
   OMP_CLAUSE_SCHEDULE_GUIDED,
   OMP_CLAUSE_SCHEDULE_AUTO,
   OMP_CLAUSE_SCHEDULE_RUNTIME,
+  OMP_CLAUSE_SCHEDULE_CILKFOR,
   OMP_CLAUSE_SCHEDULE_LAST
 };
 
diff --git a/gcc/tree.def b/gcc/tree.def
index f8d6444..558d7c8 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1051,6 +1051,10 @@ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
    Operands like for OMP_FOR.  */
 DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6)
 
+/* Cilk Plus - _Cilk_for (..)
+   Operands like for OMP_FOR.  */
+DEFTREECODE (CILK_FOR, "cilk_for", tcc_statement, 6)
+
 /* OpenMP - #pragma omp distribute [clause1 ... clauseN]
    Operands like for OMP_FOR.  */
 DEFTREECODE (OMP_DISTRIBUTE, "omp_distribute", tcc_statement, 6)

[-- Attachment #3: diff_c++.txt --]
[-- Type: text/plain, Size: 20676 bytes --]

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c99c1fc..963c28a
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -237,8 +237,8 @@ static void cp_parser_initial_pragma
 static tree cp_literal_operator_id
   (const char *);
 
-static void cp_parser_cilk_simd
-  (cp_parser *, cp_token *);
+static tree cp_parser_cilk_simd
+  (cp_parser *, cp_token *, tree);
 static bool cp_parser_omp_declare_reduction_exprs
   (tree, cp_parser *);
 
@@ -9364,6 +9364,18 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
 
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (cp_lexer_peek_token (parser->lexer)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      cp_lexer_consume_token (parser->lexer);
+	      statement = error_mark_node;
+	    }
+	  else
+	    statement = cp_parser_cilk_simd (parser, NULL, integer_zero_node);
+	  break;
+
 	case RID_BREAK:
 	case RID_CONTINUE:
 	case RID_RETURN:
@@ -28694,7 +28706,7 @@ cp_parser_omp_for_cond (cp_parser *parser, tree decl, enum tree_code code)
     case LE_EXPR:
       break;
     case NE_EXPR:
-      if (code == CILK_SIMD)
+      if (code == CILK_SIMD || code == CILK_FOR)
 	break;
       /* Fall through: OpenMP disallows NE_EXPR.  */
     default:
@@ -29019,11 +29031,18 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       bool add_private_clause = false;
       location_t loc;
 
-      if (!cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
+      if (code == CILK_SIMD
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
 	{
 	  cp_parser_error (parser, "for statement expected");
 	  return NULL;
 	}
+      if (code == CILK_FOR
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_CILK_FOR))
+	{
+	  cp_parser_error (parser, "_Cilk_for statement expected");
+	  return NULL;
+	}
       loc = cp_lexer_consume_token (parser->lexer)->location;
 
       if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
@@ -29032,13 +29051,26 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       init = decl = real_decl = NULL;
       this_pre_body = push_stmt_list ();
 
+      if (code == CILK_FOR
+	  && cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC))
+	{
+	  error_at (cp_lexer_peek_token (parser->lexer)->location,
+		    "induction variable cannot be static");
+	  cp_lexer_consume_token (parser->lexer);
+	}
       add_private_clause
 	|= cp_parser_omp_for_loop_init (parser,
-					/*parsing_openmp=*/code != CILK_SIMD,
+					/*parsing_openmp=*/
+					(code != CILK_SIMD && code != CILK_FOR),
 					this_pre_body, for_block,
 					init, decl, real_decl);
 
-      cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
+      if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON)
+	  && code == CILK_FOR)
+	{
+	  cp_parser_skip_to_end_of_statement (parser);
+	  cp_parser_consume_semicolon_at_end_of_statement (parser);
+	}
       if (this_pre_body)
 	{
 	  this_pre_body = pop_stmt_list (this_pre_body);
@@ -29196,17 +29228,30 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
 
   /* Note that we saved the original contents of this flag when we entered
      the structured block, and so we don't need to re-save it here.  */
-  if (code == CILK_SIMD)
+  if (code == CILK_SIMD || code == CILK_FOR)
     parser->in_statement = IN_CILK_SIMD_FOR;
   else
     parser->in_statement = IN_OMP_FOR;
 
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = begin_omp_parallel ();
+    }
+
   /* Note that the grammar doesn't call for a structured block here,
      though the loop as a whole is a structured block.  */
   body = push_stmt_list ();
   cp_parser_statement (parser, NULL_TREE, false, NULL);
   body = pop_stmt_list (body);
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = finish_omp_parallel (NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
   if (declv == NULL_TREE)
     ret = NULL_TREE;
   else
@@ -31084,6 +31129,38 @@ cp_parser_initial_pragma (cp_token *first_token)
   cp_lexer_get_preprocessor_token (NULL, first_token);
 }
 
+/* Parses the grainsize pragma for the _Cilk_for statement.
+   Syntax:
+   #pragma cilk grainsize = <VALUE>.  */
+
+static void
+cp_parser_cilk_grainsize (cp_parser *parser, cp_token *pragma_tok)
+{
+  if (cp_parser_require (parser, CPP_EQ, RT_EQ))
+    {
+      tree exp = cp_parser_binary_expression (parser, false, false,
+                                              PREC_NOT_OPERATOR, NULL);
+      cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+      if (!exp || exp == error_mark_node)
+        {
+          error_at (pragma_tok->location, "invalid grainsize for _Cilk_for");
+          return;
+        }
+      cp_token *n_tok = cp_lexer_peek_token (parser->lexer);
+
+      /* Make sure the next token is _Cilk_for, it is invalid otherwise.  */
+      if (n_tok && n_tok->type == CPP_KEYWORD
+	  && n_tok->keyword == RID_CILK_FOR)
+	cp_parser_cilk_simd (parser, NULL, exp);
+      else
+	warning_at (cp_lexer_peek_token (parser->lexer)->location, 0,
+		    "%<#pragma cilk grainsize%> is not followed by "
+		    "%<_Cilk_for%>");
+      return;
+    }
+  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+}
+
 /* Normal parsing of a pragma token.  Here we can (and must) use the
    regular lexer.  */
 
@@ -31263,9 +31340,30 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
 		    "%<#pragma simd%> must be inside a function");
 	  break;
 	}
-      cp_parser_cilk_simd (parser, pragma_tok);
+      cp_parser_cilk_simd (parser, pragma_tok, NULL_TREE);
       return true;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (context == pragma_external)
+        {
+          error_at (pragma_tok->location,
+                    "%<#pragma cilk grainsize%> must be inside a function");
+          break;
+        }
+
+      /* Reject the pragma with an error if Cilk Plus is not enabled.  */
+      if (flag_enable_cilkplus)
+        {
+          cp_parser_cilk_grainsize (parser, pragma_tok);
+          return true;
+        }
+      else
+        {
+          error_at (pragma_tok->location, "-fcilkplus must be enabled to use "
+                    "%<#pragma cilk grainsize%>");
+          break;
+	}
+      
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -31555,31 +31653,57 @@ cp_parser_cilk_simd_all_clauses (cp_parser *parser, cp_token *pragma_token)
     return c_finish_cilk_clauses (clauses);
 }
 
-/* Main entry-point for parsing Cilk Plus <#pragma simd> for loops.  */
+/* Main entry-point for parsing Cilk Plus <#pragma simd> for loops and
+   _Cilk_for loops.  This function returns NULL_TREE when parsing a
+   <#pragma simd> for loop because that caller ignores the return value.
+   The _Cilk_for caller checks the value, so for _Cilk_for this function
+   returns error_mark_node if the keyword is missing, else the statement.  */
 
-static void
-cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token)
+static tree
+cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token, tree grain)
 {
-  tree clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
+  bool is_cilk_for = (pragma_token == NULL);
+  tree clauses = NULL_TREE;
+
+  if (!is_cilk_for)
+    clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
 
   if (clauses == error_mark_node)
-    return;
-  
-  if (cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
+    return NULL_TREE;
+
+  if (!is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
     {
       error_at (cp_lexer_peek_token (parser->lexer)->location,
 		"for statement expected");
-      return;
+      return NULL_TREE;
+    }
+  if (is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_CILK_FOR))
+    {
+      error_at (cp_lexer_peek_token (parser->lexer)->location,
+		"_Cilk_for statement expected");
+      return error_mark_node;
     }
 
   tree sb = begin_omp_structured_block ();
   int save = cp_parser_begin_omp_structured_block (parser);
-  tree ret = cp_parser_omp_for_loop (parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  tree ret = cp_parser_omp_for_loop (parser, code, clauses, NULL);
   if (ret)
     cpp_validate_cilk_plus_loop (OMP_FOR_BODY (ret));
+
+  /* For _Cilk_for statements, the grain value is stored in the same
+     location as clauses for OMP for.  */
+  if (is_cilk_for && ret)
+    OMP_FOR_CLAUSES (ret) = grain;
+
   cp_parser_end_omp_structured_block (parser, save);
-  add_stmt (finish_omp_structured_block (sb));
-  return;
+  tree stmt = finish_omp_structured_block (sb);
+  add_stmt (stmt);
+  if (is_cilk_for) 
+    return stmt;
+  return NULL_TREE;
 }
 
 /* Create an identifier for a generic parameter type (a synthesized
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 98d7365..463e4c3 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13575,6 +13575,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
     case OMP_FOR:
     case OMP_SIMD:
     case CILK_SIMD:
+    case CILK_FOR:
     case OMP_DISTRIBUTE:
       {
 	tree clauses, body, pre_body;
@@ -13582,8 +13583,11 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
 	tree incrv = NULL_TREE;
 	int i;
 
-	clauses = tsubst_omp_clauses (OMP_FOR_CLAUSES (t), false,
-				      args, complain, in_decl);
+	if (TREE_CODE (t) == CILK_FOR)
+	  clauses = RECUR (OMP_FOR_CLAUSES (t));
+	else
+	  clauses = tsubst_omp_clauses (OMP_FOR_CLAUSES (t), false,
+					args, complain, in_decl);
 	if (OMP_FOR_INIT (t) != NULL_TREE)
 	  {
 	    declv = make_tree_vec (TREE_VEC_LENGTH (OMP_FOR_INIT (t)));
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 0bb64c7..cc1a013 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5965,7 +5965,8 @@ finish_omp_task (tree clauses, tree body)
 static bool
 handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
 			       tree condv, tree incrv, tree *body,
-			       tree *pre_body, tree clauses)
+			       tree *pre_body, tree clauses,
+			       bool is_cilk_for)
 {
   tree diff, iter_init, iter_incr = NULL, last;
   tree incr_var = NULL, orig_pre_body, orig_body, c;
@@ -5985,6 +5986,7 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
     case GE_EXPR:
     case LT_EXPR:
     case LE_EXPR:
+    case NE_EXPR:
       if (TREE_OPERAND (cond, 1) == iter)
 	cond = build2 (swap_tree_comparison (TREE_CODE (cond)),
 		       TREE_TYPE (cond), iter, TREE_OPERAND (cond, 0));
@@ -6128,6 +6130,11 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
       break;
 
   decl = create_temporary_var (TREE_TYPE (diff));
+  /* In _Cilk_for we must know the induction variable name since it is
+     read by expand_cilk_for_body in omp-low.c to set the induction
+     variable in the child function correctly.  */
+  if (is_cilk_for)
+    DECL_NAME (decl) = make_anon_name ();
   pushdecl (decl);
   add_decl_expr (decl);
   last = create_temporary_var (TREE_TYPE (diff));
@@ -6343,8 +6350,24 @@ finish_omp_for (location_t locus, enum tree_code code, tree declv, tree initv,
 				"iteration variable %qE", decl);
 	      return NULL;
 	    }
-	  if (handle_omp_for_class_iterator (i, locus, declv, initv, condv,
-					     incrv, &body, &pre_body, clauses))
+
+	  /* In _Cilk_for, all the iterator mapping code should be
+	     inserted in the OMP_PARALLEL_BODY.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree the_body = OMP_PARALLEL_BODY (body);
+	      if (TREE_CODE (the_body) == BIND_EXPR)
+		the_body = BIND_EXPR_BODY (the_body);
+	      if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						 condv, incrv, &the_body,
+						 &pre_body, clauses, true))
+		return NULL;
+	      else
+		BIND_EXPR_BODY (OMP_PARALLEL_BODY (body)) = the_body;
+	    }
+	  else if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						  condv, incrv, &body,
+						  &pre_body, clauses, false))
 	    return NULL;
 	  continue;
 	}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
index 0ebc09a..ed73c34 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -7,7 +7,8 @@ int main (void)
 {
   int q = 0, ii = 0, jj = 0;
 
-  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */
+    /* { dg-error "expected" "" { target c++ } 10 } */
     q = 5;
 
   _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
@@ -16,24 +17,30 @@ int main (void)
   _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
     q = 2;
 
-  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" "" { target c } } */
+    /* { dg-error "invalid controlling predicate" "" { target c++ }  20 } */
     q = 5;
 
   _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
     q = 5;
 
-  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected" } */ 
     q = 5;
 
   _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
     q = 5;
 
-  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } */
+ 
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static" } */
+
     q = 5;
 
+
   _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
     q = 5;
 
+
   _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
     q = 5;
 
@@ -43,7 +50,9 @@ int main (void)
   _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
     q = 5;
 
-  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */ 
+    /* { dg-error "expected" "" { target c++ }  53 } */
     q = 5;
+
   return 0;
 }
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
index ff8bc0a..e1e3217 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -29,13 +29,13 @@ int main(int argc, char **argv)
     Array1[ii] = 0;
 
 #pragma cilk grainsize = 1 
-  while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */
+  while (Array1[5] != 0) /* { dg-warning "is not followed by" } */
     {
     /* Blah */
     }
 
 #pragma cilk grainsize = 1 
-  int q = 0; /* { dg-warning "grainsize pragma is not followed" } */
+  int q = 0; /* { dg-warning "is not followed by" } */
   _Cilk_for (q = 0; q < 10; q++)
     Array1[q]  = 5;
 
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
new file mode 100644
index 0000000..8221371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
@@ -0,0 +1,25 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define SIZE 100
+#define CHECK_VALUE 5
+
+template <class T>
+int func (T start, T end)
+{
+  int Array[SIZE];
+  _Cilk_for (T ii = 0; ii < end; ii++)
+    Array[ii] = CHECK_VALUE;
+  
+  for (T ii = 0; ii < end; ii++)
+    if (Array[ii] != CHECK_VALUE)
+      __builtin_abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  return func <int> (0, 100) + func <long> (0, 100);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
new file mode 100644
index 0000000..2ac8c72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
@@ -0,0 +1,52 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array;
+vector <int> array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back (ii);
+}
+#endif
+_Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end();
+          iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+for (vector<int>::iterator iter = array_serial.begin(); 
+     iter != array_serial.end(); iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
new file mode 100644
index 0000000..1cf3301
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
@@ -0,0 +1,72 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array,array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back(ii);
+}
+#endif
+_Cilk_for (vector<int>::reverse_iterator iter4 = array.rbegin(); 
+	   iter4 != array.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+
+for (vector<int>::reverse_iterator iter4 = array_serial.rbegin();
+     iter4 != array_serial.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+_Cilk_for (vector<int>::reverse_iterator iter2 = array.rbegin(); 
+	   iter2 != array.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+for (vector<int>::reverse_iterator iter2 = array_serial.rbegin(); 
+     iter2 != array_serial.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
+
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
new file mode 100644
index 0000000..3e350a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
@@ -0,0 +1,50 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <iostream>
+#include <cstdio>
+#include <cstdlib>
+#include <vector>
+#include <algorithm>
+#include <list>
+
+using namespace std;
+
+
+int main(int argc, char **argv)
+{
+  vector <int> number_list, number_list_serial;
+  int new_number = 0;
+  int no_elements = 0;
+  
+  if (argc != 2)
+  {
+    no_elements = 10000;
+  }
+
+
+  number_list.clear();
+  number_list_serial.clear();
+  for (int ii = 0; ii < no_elements; ii++)
+  {
+    number_list.push_back(new_number);
+    number_list_serial.push_back(new_number);
+  }
+
+  _Cilk_for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list[jj] = jj + no_elements;
+  }
+  for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list_serial[jj] = jj + no_elements;
+  }
+
+  for (int jj = 0; jj < no_elements; jj++)
+    if (number_list_serial[jj] != number_list[jj])
+      abort ();
+
+  return 0;
+}

[-- Attachment #4: c-ChangeLog --]
[-- Type: application/octet-stream, Size: 4350 bytes --]

gcc/ChangeLog
2014-01-06  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* cilk-builtins.def: Added two new builtin functions called
	__cilkrts_cilk_for_32 and __cilkrts_cilk_for_64.
	* cilk-common.c (cilk_init_builtins): Likewise.
	(cilk_declare_looper): New function.
	* cilk.h (enum cilk_tree_index): Added two new fields called
	CILK_TI_F_LOOP_32 and CILK_TI_F_LOOP_64.
	(cilk_for_32_fndecl): New #define.
	(cilk_for_64_fndecl): Likewise.
	* gimple-pretty-print.c (dump_gimple_omp_for): Added a new case for
	GF_OMP_FOR_KIND_CILKFOR.  Also emitted "_Cilk_for" instead of "for ("
	when the gimple kind is GF_OMP_FOR_KIND_CILKFOR.
	* gimple.h (enum gf_mask): Added a new field GF_OMP_FOR_KIND_CILKFOR.
	Re-arranged a couple of other fields to keep them all in ascending order.
	(struct gimple_omp_for_iter): Added a new field called "grain."
	(gimple_cilk_for_set_grain): New function.
	(gimple_cilk_for_induction_var): Likewise.
	(gimple_cilk_for_grain): Likewise.
	* gimplify.c (gimplify_omp_for): Added code to handle gimplification
	of a _Cilk_for statement.
	* omp-low.c (struct cilk_for_information): New structure.
	(create_omp_child_function_name): Added a new bool parameter called
	is_cilk_for.  If this is set, then use a different suffix.
	(extract_omp_for_data): Added a check for _Cilk_for's kind for a
	NE_EXPR case.  Added the correct schedule type for _Cilk_for.
	(use_pointer_for_field): Reject using of pointers for the induction
	variable of the outer function.
	(find_cilk_for_stmt): New function.
	(is_cilk_for_stmt): Likewise.
	(cilk_for_check_loop_diff_type): Likewise.
	(expand_cilk_for_body): Likewise.
	(is_cilk_loop_var): Likewise.
	(cilk_find_field_value): Likewise.
	(cilk_find_component_expr): Likewise.
	(find_cilk_for_vars): Likewise.
	(insert_cilk_for_fn_call): Likewise.
	(create_omp_child_function): Added two new parameters to pass in
	whether it is a _Cilk_for body and the induction variable type.  If
	it is _Cilk_for, then create two new parameters and different function-
	type.
	(lower_rec_input_clauses): Set the new decl expr value to the
	variable for the "__cilk_init," "__cilk_cond" and "__cilk_incr"
	variables.
	(scan_omp_parallel): Added a check if the outer statement is a
	_Cilk_for and if so, then find the correct induction variable type to
	pass them into create_omp_child_function.
	(expand_omp_taskreg): Added code to extract the high and low parameters
	from the child function and then insert it in the appropriate location.
	Added a call to expand_cilk_for_body.  Allowed the insertion of the
	library calls when the taskreg being expanded is not a _Cilk_for.
	(expand_omp_for): Added a check for GF_OMP_FOR_KIND_CILKFOR for the
	for statement's kind.  If so then call insert_cilk_for_fn_call.
	* tree-core.h (enum omp_clause_schedule_kind): Added a new field
	OMP_CLAUSE_SCHEDULE_CILKFOR.
	* tree.def (CILK_FOR): New tree.

gcc/c-family/ChangeLog
2014-01-06  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-omp.c (c_finish_omp_for): Added a check for CILK_FOR along with
	CILK_SIMD.
	* c-common.h (enum rid): Added new value called "RID_CILK_FOR."
	* c-common.c (c_common_reswords[]): Added a new field "_Cilk_for."
	* c-pragma.c (init_pragma): Added cilk grainsize pragma.
	* c-pragma.h (enum pragma_kind): Added PRAGMA_CILK_GRAINSIZE.

gcc/c/ChangeLog
2014-01-06  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-parser.c (c_parser_statement_after_labels): Added a RID_CILK_FOR
	case.
	(c_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(c_parser_omp_for_loop): Renamed the "clauses" parameter to
	"clauses_or_grain."  Added handling for _Cilk_for statements.  Set
	the grain value to the clauses location.
	(c_parser_cilk_grainsize): New function.
	(c_parser_cilk_simd): Added a new parameter called grain.  Also added
	support to parse _Cilk_for statements.

gcc/testsuite/ChangeLog
2014-01-06  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk-fors.c: New testcase.
	* c-c++-common/cilk-plus/CK/nested_cilk_for.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.

[-- Attachment #5: cp-ChangeLog --]
[-- Type: application/octet-stream, Size: 1532 bytes --]

gcc/cp/ChangeLog
2014-01-06  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* parser.c (cp_parser_statement): Added a RID_CILK_FOR case.
	(cp_parser_omp_for_cond): Included a check for CILK_FOR along with
	CILK_SIMD.
	(cp_parser_omp_for_loop): Overall, added support to parse _Cilk_for
	statement along with omp for statements.
	(cp_parser_cilk_grainsize): New function.
	(cp_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(cp_parser_cilk_simd): Added a new parameter for grain.  Added support
	to handle _Cilk_for loops along with #pragma simd for loops.
	* pt.c (tsubst_expr): Added CILK_FOR case.  If the tree is CILK_FOR
	then just RECUR its clauses, instead of calling tsubst_omp_clauses.
	* semantics.c (handle_omp_for_class_iterator): Added a new bool
	parameter called is_cilk_for.
	Added a NE_EXPR case.  Added a check for _Cilk_for statement and
	if so, then give a name for the new induction variable.
	(finish_omp_for): Added a check if the code is _Cilk_for and if true
	then insert all the iterator temporary variables into the _Cilk_for
	body.

gcc/testsuite/ChangeLog
2014-01-06  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Made certain error
	tags C specific and inserted their C++ equivalents.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* g++.dg/cilk-plus/CK/cilk-for-tplt.cc: New testcase.
	* g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_rev_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_test.cc: Likewise.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2013-12-18  0:22                         ` Iyer, Balaji V
@ 2014-01-07 20:40                           ` Jason Merrill
  2014-01-07 21:24                             ` Iyer, Balaji V
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Merrill @ 2014-01-07 20:40 UTC (permalink / raw)
  To: Iyer, Balaji V, 'Jeff Law', 'Aldy Hernandez'
  Cc: 'gcc-patches@gcc.gnu.org', 'rth@redhat.com',
	'Jakub Jelinek'

On 12/17/2013 07:21 PM, Iyer, Balaji V wrote:
> The reason why I store it in OMP_FOR_CLAUSE is because OMP clauses cannot occur in _Cilk_for. So adding a new clause seem to be an overkill IMHO. I need a place to store the grain value and so I chose this spot.

But code expects OMP_FOR_CLAUSES to have a certain form, and you are 
violating that so that now code needs to check whether we're dealing 
with a for loop in order to know to parse OMP_FOR_CLAUSES.  Doing it 
your way requires lots of little special cases.  Please represent it as 
a clause.
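
To make the concern concrete, here is a minimal, self-contained C model of an overloaded clause slot. It is only a sketch: it does not use GCC's tree API, and every name in it (struct loop, clauses_or_grain, count_clauses, loop_grain) is a hypothetical stand-in. It shows that once the slot can hold either a clause chain or a bare grain value, every consumer has to check the loop kind before touching it.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these are GCC's real tree types.  */
enum loop_kind { LOOP_OMP_FOR, LOOP_CILK_FOR };

struct clause
{
  int kind;
  struct clause *next;
};

struct loop
{
  enum loop_kind kind;
  /* Overloaded slot: a clause chain for LOOP_OMP_FOR, but a pointer to
     a bare grain value for LOOP_CILK_FOR.  */
  void *clauses_or_grain;
};

/* Every walker of the slot needs this guard.  Without it, the grain
   value would be misread as a clause chain.  */
static size_t
count_clauses (const struct loop *l)
{
  if (l->kind == LOOP_CILK_FOR)
    return 0;

  size_t n = 0;
  for (const struct clause *c = l->clauses_or_grain; c; c = c->next)
    n++;
  return n;
}

/* The grain reader needs the mirror-image guard.  */
static long
loop_grain (const struct loop *l)
{
  if (l->kind != LOOP_CILK_FOR)
    return 0;
  return *(const long *) l->clauses_or_grain;
}
```

Each additional consumer of the slot would need the same kind check, which is the "lots of little special cases" cost in this model.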

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2014-01-07 20:40                           ` Jason Merrill
@ 2014-01-07 21:24                             ` Iyer, Balaji V
  2014-01-07 21:29                               ` Jakub Jelinek
  0 siblings, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-07 21:24 UTC (permalink / raw)
  To: Jason Merrill, 'Jeff Law', 'Aldy Hernandez'
  Cc: 'gcc-patches@gcc.gnu.org', 'rth@redhat.com',
	'Jakub Jelinek'



> -----Original Message-----
> From: Jason Merrill [mailto:jason@redhat.com]
> Sent: Tuesday, January 7, 2014 3:41 PM
> To: Iyer, Balaji V; 'Jeff Law'; 'Aldy Hernandez'
> Cc: 'gcc-patches@gcc.gnu.org'; 'rth@redhat.com'; 'Jakub Jelinek'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On 12/17/2013 07:21 PM, Iyer, Balaji V wrote:
> > The reason why I store it in OMP_FOR_CLAUSE is because OMP clauses
> cannot occur in _Cilk_for. So adding a new clause seem to be an overkill
> IMHO. I need a place to store the grain value and so I chose this spot.
> 
> But code expects OMP_FOR_CLAUSES to have a certain form, and you are
> violating that so that now code needs to check whether we're dealing with a
> for loop in order to know to parse OMP_FOR_CLAUSES.  Doing it your way
> requires lots of little special cases.  Please represent it as a clause.

Hi Jason,
	In gimplify_omp_for, I take the grain value out of OMP_FOR_CLAUSES () and then replace it with NULL_TREE. Until that point, nothing touches it (except pt.c, which I handle explicitly).
	The grain value is then stored in the GIMPLE representation of omp_for.

Thanks,

Balaji V. Iyer.

> 
> Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] _Cilk_for for C and C++
  2014-01-07 21:24                             ` Iyer, Balaji V
@ 2014-01-07 21:29                               ` Jakub Jelinek
  2014-01-07 22:12                                 ` Iyer, Balaji V
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Jelinek @ 2014-01-07 21:29 UTC (permalink / raw)
  To: Iyer, Balaji V
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

On Tue, Jan 07, 2014 at 09:24:21PM +0000, Iyer, Balaji V wrote:
> > -----Original Message-----
> > From: Jason Merrill [mailto:jason@redhat.com]
> > Sent: Tuesday, January 7, 2014 3:41 PM
> > To: Iyer, Balaji V; 'Jeff Law'; 'Aldy Hernandez'
> > Cc: 'gcc-patches@gcc.gnu.org'; 'rth@redhat.com'; 'Jakub Jelinek'
> > Subject: Re: [PATCH] _Cilk_for for C and C++
> > 
> > On 12/17/2013 07:21 PM, Iyer, Balaji V wrote:
> > > The reason why I store it in OMP_FOR_CLAUSE is because OMP clauses
> > cannot occur in _Cilk_for. So adding a new clause seem to be an overkill
> > IMHO. I need a place to store the grain value and so I chose this spot.
> > 
> > But code expects OMP_FOR_CLAUSES to have a certain form, and you are
> > violating that so that now code needs to check whether we're dealing with a
> > for loop in order to know to parse OMP_FOR_CLAUSES.  Doing it your way
> > requires lots of little special cases.  Please represent it as a clause.
> 
> Hi Jason,
> 	In gimplify_omp_for, I remove the information in OMP_FOR_CLAUSES ()
> 	and then replace it with a NULL_TREE.  Till that point, nothing
> 	steps on it (except in pt.c and that I am handling it).  Then the
> 	grain value is stored in gimple tree for omp_for.

So, you are abusing OMP_FOR_CLAUSES for shorter time, still, I agree with
Jason that you shouldn't do that.

If you don't want to add a new clause, just use a similar existing one,
if grain is something like scheduling chunk size, just with a different
name for it, then using OMP_CLAUSE_SCHEDULE with OMP_CLAUSE_SCHEDULE_EXPR
being the grain expression would be certainly cleaner.
But even adding a new artificial clause isn't that hard.
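
To illustrate the point about representing the grain as a clause, here is a
minimal standalone sketch. It does not use GCC's tree API: Clause, ClauseKind,
add_clause, find_clause and grain_or_default are all hypothetical names made
up for this example. The idea it models is that once the grain is an ordinary
entry in the clause chain, every consumer can walk the chain the same way and
no consumer needs a special case for the kind of loop it is looking at.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for clause codes; NOT GCC's real OMP_CLAUSE_* enum.
enum class ClauseKind { Schedule, Safelen, Grainsize };

// One node in a singly linked clause chain (loosely modelled on how
// OMP_CLAUSE_CHAIN links clauses; the layout here is an assumption).
struct Clause {
  ClauseKind kind;
  long expr;                      // stands in for the clause's expression
  std::unique_ptr<Clause> chain;  // next clause, or null at the end
};

// Prepend a clause to an existing chain and return the new head.
std::unique_ptr<Clause> add_clause(ClauseKind kind, long expr,
                                   std::unique_ptr<Clause> chain) {
  return std::unique_ptr<Clause>(new Clause{kind, expr, std::move(chain)});
}

// Generic lookup: works for any clause kind, so callers never have to ask
// what sort of loop owns the chain before reading it.
const Clause* find_clause(const Clause* chain, ClauseKind kind) {
  for (const Clause* c = chain; c != nullptr; c = c->chain.get())
    if (c->kind == kind)
      return c;
  return nullptr;
}

// Read the grain if a grain clause is present, else fall back to a default.
long grain_or_default(const Clause* chain, long dflt) {
  const Clause* c = find_clause(chain, ClauseKind::Grainsize);
  return c != nullptr ? c->expr : dflt;
}
```

With this shape, code that does not care about the grain simply never looks
for that clause kind, and code that does care calls the same lookup it would
use for any other clause.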

	Jakub

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH] _Cilk_for for C and C++
  2014-01-07 21:29                               ` Jakub Jelinek
@ 2014-01-07 22:12                                 ` Iyer, Balaji V
  2014-01-08 17:31                                   ` Jakub Jelinek
  0 siblings, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-07 22:12 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

[-- Attachment #1: Type: text/plain, Size: 2286 bytes --]



> -----Original Message-----
> From: Jakub Jelinek [mailto:jakub@redhat.com]
> Sent: Tuesday, January 7, 2014 4:29 PM
> To: Iyer, Balaji V
> Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'rth@redhat.com'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On Tue, Jan 07, 2014 at 09:24:21PM +0000, Iyer, Balaji V wrote:
> > > -----Original Message-----
> > > From: Jason Merrill [mailto:jason@redhat.com]
> > > Sent: Tuesday, January 7, 2014 3:41 PM
> > > To: Iyer, Balaji V; 'Jeff Law'; 'Aldy Hernandez'
> > > Cc: 'gcc-patches@gcc.gnu.org'; 'rth@redhat.com'; 'Jakub Jelinek'
> > > Subject: Re: [PATCH] _Cilk_for for C and C++
> > >
> > > On 12/17/2013 07:21 PM, Iyer, Balaji V wrote:
> > > > The reason why I store it in OMP_FOR_CLAUSE is because OMP clauses
> > > cannot occur in _Cilk_for. So adding a new clause seem to be an
> > > overkill IMHO. I need a place to store the grain value and so I chose this
> spot.
> > >
> > > But code expects OMP_FOR_CLAUSES to have a certain form, and you
> are
> > > violating that so that now code needs to check whether we're dealing
> > > with a for loop in order to know to parse OMP_FOR_CLAUSES.  Doing it
> > > your way requires lots of little special cases.  Please represent it as a
> clause.
> >
> > Hi Jason,
> > 	In gimplify_omp_for, I remove the information in
> OMP_FOR_CLAUSES ()
> > 	and then replace it with a NULL_TREE.  Till that point, nothing
> > 	steps on it (except in pt.c and that I am handling it).  Then the
> > 	grain value is stored in gimple tree for omp_for.
> 
> So, you are abusing OMP_FOR_CLAUSES for shorter time, still, I agree with
> Jason that you shouldn't do that.
> 
> If you don't want to add a new clause, just use a similar existing one, if grain
> is something like scheduling chunk size, just with a different name for it, then
> using OMP_CLAUSE_SCHEDULE with OMP_CLAUSE_SCHEDULE_EXPR being
> the grain expression would be certainly cleaner.
> But even adding a new artificial clause isn't that hard.
> 

Hi Jason and Jakub,
	I used a similar existing clause (safelen) to hold the grain value. Attached are two fixed patches, for C and C++, along with their ChangeLogs.
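
For readers unfamiliar with the grain value being stored here, the following
is a rough, serial sketch of what a grain size means for a loop: the iteration
range is split in half recursively until a piece holds at most "grain"
iterations, and each such piece then runs as an ordinary serial loop. This is
only an illustration under that assumption. chunked_for and its parameters are
hypothetical names, the code spawns nothing in parallel, and it is not the
code the patch generates or what the Cilk runtime actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: run body(i) for every i in [low, high), splitting the
// range in half until each piece has at most `grain` iterations.  If
// `chunk_sizes` is non-null, the size of every serial piece is recorded so
// the effect of the grain can be inspected.
template <typename Body>
void chunked_for(long low, long high, long grain, Body body,
                 std::vector<long>* chunk_sizes) {
  if (grain < 1)
    grain = 1;  // treat a non-positive grain as 1 in this sketch
  if (high - low <= grain) {
    if (chunk_sizes != nullptr && high > low)
      chunk_sizes->push_back(high - low);
    for (long i = low; i < high; ++i)
      body(i);
    return;
  }
  long mid = low + (high - low) / 2;
  // A parallel runtime could execute these two halves concurrently;
  // here they simply run one after the other.
  chunked_for(low, mid, grain, body, chunk_sizes);
  chunked_for(mid, high, grain, body, chunk_sizes);
}
```

A larger grain gives fewer, longer serial pieces (less scheduling overhead);
a smaller grain gives more pieces that could be balanced across workers.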

Is this OK for trunk?

Thanks,

Balaji V. Iyer.

> 	Jakub

[-- Attachment #2: diff_c++.txt --]
[-- Type: text/plain, Size: 21321 bytes --]

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c99c1fc..6ad35d0
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -237,8 +237,8 @@ static void cp_parser_initial_pragma
 static tree cp_literal_operator_id
   (const char *);
 
-static void cp_parser_cilk_simd
-  (cp_parser *, cp_token *);
+static tree cp_parser_cilk_simd
+  (cp_parser *, cp_token *, tree);
 static bool cp_parser_omp_declare_reduction_exprs
   (tree, cp_parser *);
 
@@ -9364,6 +9364,18 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
 
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (cp_lexer_peek_token (parser->lexer)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      cp_lexer_consume_token (parser->lexer);
+	      statement = error_mark_node;
+	    }
+	  else
+	    statement = cp_parser_cilk_simd (parser, NULL, integer_zero_node);
+	  break;
+
 	case RID_BREAK:
 	case RID_CONTINUE:
 	case RID_RETURN:
@@ -28694,7 +28706,7 @@ cp_parser_omp_for_cond (cp_parser *parser, tree decl, enum tree_code code)
     case LE_EXPR:
       break;
     case NE_EXPR:
-      if (code == CILK_SIMD)
+      if (code == CILK_SIMD || code == CILK_FOR)
 	break;
       /* Fall through: OpenMP disallows NE_EXPR.  */
     default:
@@ -29019,11 +29031,18 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       bool add_private_clause = false;
       location_t loc;
 
-      if (!cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
+      if (code == CILK_SIMD
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
 	{
 	  cp_parser_error (parser, "for statement expected");
 	  return NULL;
 	}
+      if (code == CILK_FOR
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_CILK_FOR))
+	{
+	  cp_parser_error (parser, "_Cilk_for statement expected");
+	  return NULL;
+	}
       loc = cp_lexer_consume_token (parser->lexer)->location;
 
       if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
@@ -29032,13 +29051,26 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       init = decl = real_decl = NULL;
       this_pre_body = push_stmt_list ();
 
+      if (code == CILK_FOR
+	  && cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC))
+	{
+	  error_at (cp_lexer_peek_token (parser->lexer)->location,
+		    "induction variable cannot be static");
+	  cp_lexer_consume_token (parser->lexer);
+	}
       add_private_clause
 	|= cp_parser_omp_for_loop_init (parser,
-					/*parsing_openmp=*/code != CILK_SIMD,
+					/*parsing_openmp=*/
+					(code != CILK_SIMD && code != CILK_FOR),
 					this_pre_body, for_block,
 					init, decl, real_decl);
 
-      cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
+      if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON)
+	  && code == CILK_FOR)
+	{
+	  cp_parser_skip_to_end_of_statement (parser);
+	  cp_parser_consume_semicolon_at_end_of_statement (parser);
+	}
       if (this_pre_body)
 	{
 	  this_pre_body = pop_stmt_list (this_pre_body);
@@ -29196,17 +29228,30 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
 
   /* Note that we saved the original contents of this flag when we entered
      the structured block, and so we don't need to re-save it here.  */
-  if (code == CILK_SIMD)
+  if (code == CILK_SIMD || code == CILK_FOR)
     parser->in_statement = IN_CILK_SIMD_FOR;
   else
     parser->in_statement = IN_OMP_FOR;
 
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = begin_omp_parallel ();
+    }
+
   /* Note that the grammar doesn't call for a structured block here,
      though the loop as a whole is a structured block.  */
   body = push_stmt_list ();
   cp_parser_statement (parser, NULL_TREE, false, NULL);
   body = pop_stmt_list (body);
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = finish_omp_parallel (NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
   if (declv == NULL_TREE)
     ret = NULL_TREE;
   else
@@ -31084,6 +31129,38 @@ cp_parser_initial_pragma (cp_token *first_token)
   cp_lexer_get_preprocessor_token (NULL, first_token);
 }
 
+/* Parses the grainsize pragma for the _Cilk_for statement.
+   Syntax:
+   #pragma cilk grainsize = <VALUE>.  */
+
+static void
+cp_parser_cilk_grainsize (cp_parser *parser, cp_token *pragma_tok)
+{
+  if (cp_parser_require (parser, CPP_EQ, RT_EQ))
+    {
+      tree exp = cp_parser_binary_expression (parser, false, false,
+                                              PREC_NOT_OPERATOR, NULL);
+      cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+      if (!exp || exp == error_mark_node)
+        {
+          error_at (pragma_tok->location, "invalid grainsize for _Cilk_for");
+          return;
+        }
+      cp_token *n_tok = cp_lexer_peek_token (parser->lexer);
+
+      /* Make sure the next token is _Cilk_for, it is invalid otherwise.  */
+      if (n_tok && n_tok->type == CPP_KEYWORD
+	  && n_tok->keyword == RID_CILK_FOR)
+	cp_parser_cilk_simd (parser, NULL, exp);
+      else
+	warning_at (cp_lexer_peek_token (parser->lexer)->location, 0,
+		    "%<#pragma cilk grainsize%> is not followed by "
+		    "%<_Cilk_for%>");
+      return;
+    }
+  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+}
+
 /* Normal parsing of a pragma token.  Here we can (and must) use the
    regular lexer.  */
 
@@ -31263,9 +31340,30 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
 		    "%<#pragma simd%> must be inside a function");
 	  break;
 	}
-      cp_parser_cilk_simd (parser, pragma_tok);
+      cp_parser_cilk_simd (parser, pragma_tok, NULL_TREE);
       return true;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (context == pragma_external)
+        {
+          error_at (pragma_tok->location,
+                    "%<#pragma cilk grainsize%> must be inside a function");
+          break;
+        }
+
+      /* Ignore the pragma if Cilk Plus is not enabled.  */
+      if (flag_enable_cilkplus)
+        {
+          cp_parser_cilk_grainsize (parser, pragma_tok);
+          return true;
+        }
+      else
+        {
+          error_at (pragma_tok->location, "-fcilkplus must be enabled to use "
+                    "%<#pragma cilk grainsize%>");
+          break;
+	}
+      
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -31555,31 +31653,63 @@ cp_parser_cilk_simd_all_clauses (cp_parser *parser, cp_token *pragma_token)
     return c_finish_cilk_clauses (clauses);
 }
 
-/* Main entry-point for parsing Cilk Plus <#pragma simd> for loops.  */
+/* Main entry-point for parsing Cilk Plus <#pragma simd> for and _Cilk_for
+   loops.  This function returns NULL_TREE whenever it is parsing the
+   #pragma simd's for because the caller does not check the return value.
+   _Cilk_for's caller checks this value and thus return error_mark_node
+   when errors happen and a valid value when things go as expected.  */
 
-static void
-cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token)
+static tree
+cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token, tree grain)
 {
-  tree clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
+  bool is_cilk_for = !pragma_token ? true: false;
+  tree clauses = NULL_TREE;
+
+  if (!is_cilk_for)
+    clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
 
   if (clauses == error_mark_node)
-    return;
-  
-  if (cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
+    return NULL_TREE;
+
+  if (!is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
     {
       error_at (cp_lexer_peek_token (parser->lexer)->location,
 		"for statement expected");
-      return;
+      return NULL_TREE;
+    }
+  if (is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_CILK_FOR))
+    {
+      error_at (cp_lexer_peek_token (parser->lexer)->location,
+		"_Cilk_for statement expected");
+      return error_mark_node;
     }
 
   tree sb = begin_omp_structured_block ();
   int save = cp_parser_begin_omp_structured_block (parser);
-  tree ret = cp_parser_omp_for_loop (parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  tree ret = cp_parser_omp_for_loop (parser, code, clauses, NULL);
   if (ret)
     cpp_validate_cilk_plus_loop (OMP_FOR_BODY (ret));
+
+  /* For _Cilk_for statements, the grain value is stored in the same
+     location as clauses for OMP for.  */
+  if (is_cilk_for && ret)
+    { 
+      tree l = build_omp_clause (EXPR_LOCATION (grain),
+				 OMP_CLAUSE_SAFELEN);
+      OMP_CLAUSE_SAFELEN_EXPR (l) = grain;
+      OMP_CLAUSE_CHAIN (l) = NULL_TREE;
+      OMP_FOR_CLAUSES (ret) = l;
+    }
+
   cp_parser_end_omp_structured_block (parser, save);
-  add_stmt (finish_omp_structured_block (sb));
-  return;
+  tree stmt = finish_omp_structured_block (sb);
+  add_stmt (stmt);
+  if (is_cilk_for) 
+    return stmt;
+  return NULL_TREE;
 }
 
 /* Create an identifier for a generic parameter type (a synthesized
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 98d7365..99d092b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13575,6 +13575,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
     case OMP_FOR:
     case OMP_SIMD:
     case CILK_SIMD:
+    case CILK_FOR:
     case OMP_DISTRIBUTE:
       {
 	tree clauses, body, pre_body;
@@ -13582,8 +13583,22 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
 	tree incrv = NULL_TREE;
 	int i;
 
-	clauses = tsubst_omp_clauses (OMP_FOR_CLAUSES (t), false,
-				      args, complain, in_decl);
+	/* We cannot use the tsubst_omp_clauses since it will try to
+	   do checking such as whether a certain clause can be used
+	   with a certain for-loop.  We are just use safelen clause here 
+	   as a holder to hold the grain value.  */
+	if (TREE_CODE (t) == CILK_FOR)
+	  {
+	    tree l = OMP_FOR_CLAUSES (t);
+	    l = RECUR (OMP_CLAUSE_SAFELEN_EXPR (l));
+	    clauses = build_omp_clause (EXPR_LOCATION (l),
+					OMP_CLAUSE_SAFELEN);
+	    OMP_CLAUSE_SAFELEN_EXPR (clauses) = l;
+	    OMP_CLAUSE_CHAIN (clauses) = NULL_TREE;
+	  } 
+	else
+	  clauses = tsubst_omp_clauses (OMP_FOR_CLAUSES (t), false,
+					args, complain, in_decl);
 	if (OMP_FOR_INIT (t) != NULL_TREE)
 	  {
 	    declv = make_tree_vec (TREE_VEC_LENGTH (OMP_FOR_INIT (t)));
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 0bb64c7..cc1a013 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5965,7 +5965,8 @@ finish_omp_task (tree clauses, tree body)
 static bool
 handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
 			       tree condv, tree incrv, tree *body,
-			       tree *pre_body, tree clauses)
+			       tree *pre_body, tree clauses,
+			       bool is_cilk_for)
 {
   tree diff, iter_init, iter_incr = NULL, last;
   tree incr_var = NULL, orig_pre_body, orig_body, c;
@@ -5985,6 +5986,7 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
     case GE_EXPR:
     case LT_EXPR:
     case LE_EXPR:
+    case NE_EXPR:
       if (TREE_OPERAND (cond, 1) == iter)
 	cond = build2 (swap_tree_comparison (TREE_CODE (cond)),
 		       TREE_TYPE (cond), iter, TREE_OPERAND (cond, 0));
@@ -6128,6 +6130,11 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
       break;
 
   decl = create_temporary_var (TREE_TYPE (diff));
+  /* In _Cilk_for we must know the induction variable name since it is
+     read by expand_cilk_for_body in omp-low.c to set the induction
+     variable in the child function correctly.  */
+  if (is_cilk_for)
+    DECL_NAME (decl) = make_anon_name ();
   pushdecl (decl);
   add_decl_expr (decl);
   last = create_temporary_var (TREE_TYPE (diff));
@@ -6343,8 +6350,24 @@ finish_omp_for (location_t locus, enum tree_code code, tree declv, tree initv,
 				"iteration variable %qE", decl);
 	      return NULL;
 	    }
-	  if (handle_omp_for_class_iterator (i, locus, declv, initv, condv,
-					     incrv, &body, &pre_body, clauses))
+
+	  /* In _Cilk_for, all the iterator mapping code should be
+	     inserted in the OMP_PARALLEL_BODY.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree the_body = OMP_PARALLEL_BODY (body);
+	      if (TREE_CODE (the_body) == BIND_EXPR)
+		the_body = BIND_EXPR_BODY (the_body);
+	      if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						 condv, incrv, &the_body,
+						 &pre_body, clauses, true))
+		return NULL;
+	      else
+		BIND_EXPR_BODY (OMP_PARALLEL_BODY (body)) = the_body;
+	    }
+	  else if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						  condv, incrv, &body,
+						  &pre_body, clauses, false))
 	    return NULL;
 	  continue;
 	}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
index 0ebc09a..ed73c34 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -7,7 +7,8 @@ int main (void)
 {
   int q = 0, ii = 0, jj = 0;
 
-  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */
+    /* { dg-error "expected" "" { target c++ } 10 } */
     q = 5;
 
   _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
@@ -16,24 +17,30 @@ int main (void)
   _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
     q = 2;
 
-  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" "" { target c } } */
+    /* { dg-error "invalid controlling predicate" "" { target c++ }  20 } */
     q = 5;
 
   _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
     q = 5;
 
-  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected" } */ 
     q = 5;
 
   _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
     q = 5;
 
-  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } */
+ 
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static" } */
+
     q = 5;
 
+
   _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
     q = 5;
 
+
   _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
     q = 5;
 
@@ -43,7 +50,9 @@ int main (void)
   _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
     q = 5;
 
-  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */ 
+    /* { dg-error "expected" "" { target c++ }  53 } */
     q = 5;
+
   return 0;
 }
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
index ff8bc0a..e1e3217 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -29,13 +29,13 @@ int main(int argc, char **argv)
     Array1[ii] = 0;
 
 #pragma cilk grainsize = 1 
-  while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */
+  while (Array1[5] != 0) /* { dg-warning "is not followed by" } */
     {
     /* Blah */
     }
 
 #pragma cilk grainsize = 1 
-  int q = 0; /* { dg-warning "grainsize pragma is not followed" } */
+  int q = 0; /* { dg-warning "is not followed by" } */
   _Cilk_for (q = 0; q < 10; q++)
     Array1[q]  = 5;
 
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
new file mode 100644
index 0000000..8221371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
@@ -0,0 +1,25 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define SIZE 100
+#define CHECK_VALUE 5
+
+template <class T>
+int func (T start, T end)
+{
+  int Array[SIZE];
+  _Cilk_for (T ii = 0; ii < end; ii++)
+    Array[ii] = CHECK_VALUE;
+  
+  for (T ii = 0; ii < end; ii++)
+    if (Array[ii] != CHECK_VALUE)
+      __builtin_abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  return func <int> (0, 100) + func <long> (0, 100);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
new file mode 100644
index 0000000..2ac8c72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
@@ -0,0 +1,52 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array;
+vector <int> array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back (ii);
+}
+#endif
+_Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end();
+          iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+for (vector<int>::iterator iter = array_serial.begin(); 
+     iter != array_serial.end(); iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
new file mode 100644
index 0000000..1cf3301
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
@@ -0,0 +1,72 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array,array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back(ii);
+}
+#endif
+_Cilk_for (vector<int>::reverse_iterator iter4 = array.rbegin(); 
+	   iter4 != array.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+
+_Cilk_for (vector<int>::reverse_iterator iter4 = array_serial.rbegin(); 
+	   iter4 != array_serial.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+_Cilk_for (vector<int>::reverse_iterator iter2 = array.rbegin(); 
+	   iter2 != array.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+for (vector<int>::reverse_iterator iter2 = array_serial.rbegin(); 
+     iter2 != array_serial.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
+
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
new file mode 100644
index 0000000..3e350a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
@@ -0,0 +1,50 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <iostream>
+#include <cstdio>
+#include <cstdlib>
+#include <vector>
+#include <algorithm>
+#include <list>
+
+using namespace std;
+
+
+int main(int argc, char **argv)
+{
+  vector <int> number_list, number_list_serial;
+  int new_number = 0;
+  int no_elements = 0;
+  
+  if (argc != 2)
+  {
+    no_elements = 10000;
+  }
+
+
+  number_list.clear();
+  number_list_serial.clear();
+  for (int ii = 0; ii < no_elements; ii++)
+  {
+    number_list.push_back(new_number);
+    number_list_serial.push_back(new_number);
+  }
+
+  _Cilk_for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list[jj] = jj + no_elements;
+  }
+  for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list_serial[jj] = jj + no_elements;
+  }
+
+  for (int jj = 0; jj < no_elements; jj++)
+    if (number_list_serial[jj] != number_list[jj])
+      abort ();
+
+  return 0;
+}

[-- Attachment #3: c-ChangeLog --]
[-- Type: application/octet-stream, Size: 4350 bytes --]

gcc/ChangeLog
2014-01-07  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* cilk-builtins.def: Added two new builtin functions called
	__cilkrts_cilk_for_32 and __cilkrts_cilk_for_64.
	* cilk-common.c (cilk_init_builtins): Likewise.
	(cilk_declare_looper): New function.
	* cilk.h (enum cilk_tree_index): Added two new fields called
	CILK_TI_F_LOOP_32 and CILK_TI_F_LOOP_64.
	(cilk_for_32_fndecl): New #define.
	(cilk_for_64_fndecl): Likewise.
	* gimple-pretty-print.c (dump_gimple_omp_for): Added a new case for
	GF_OMP_FOR_KIND_CILKFOR.  Also emitted "_Cilk_for" instead of "for ("
	for when the gimple kind is GF_OMP_FOR_KIND_CILKFOR.
	* gimple.h (enum gf_mask): Added a new field GF_OMP_FOR_KIND_CILKFOR.
	Re-arranged couple other fields to make them all in ascending order.
	(struct gimple_omp_for_iter): Added a new field called "grain."
	(gimple_cilk_for_set_grain): New function.
	(gimple_cilk_for_induction_var): Likewise.
	(gimple_cilk_for_grain): Likewise.
	* gimplify.c (gimplify_omp_for): Added code to handle gimplification
	of a _Cilk_for statement.
	* omp-low.c (struct cilk_for_information): New structure.
	(create_omp_child_function_name): Added a new bool parameter called
	is_cilk_for.  If this is set, then use a different suffix.
	(extract_omp_for_data): Added a check for _Cilk_for's kind for a
	NE_EXPR case.  Added the correct schedule type for _Cilk_for.
	(use_pointer_for_field): Reject using of pointers for the induction
	variable of the outer function.
	(find_cilk_for_stmt): New function.
	(is_cilk_for_stmt): Likewise.
	(cilk_for_check_loop_diff_type): Likewise.
	(expand_cilk_for_body): Likewise.
	(is_cilk_loop_var): Likewise.
	(cilk_find_field_value): Likewise.
	(cilk_find_component_expr): Likewise.
	(find_cilk_for_vars): Likewise.
	(insert_cilk_for_fn_call): Likewise.
	(create_omp_child_function): Added two new parameters to pass in
	whether it is a _Cilk_for body and the induction variable type.  If
	it is _Cilk_for, then create two new parameters and different function-
	type.
	(lower_rec_input_clauses): Set the new decl expr value to the
	variable for the "__cilk_init," "__cilk_cond" and "__cilk_incr"
	variables.
	(scan_omp_parallel): Added a check if the outer statement is a
	_Cilk_for and if so, then find the correct induction variable type to
	pass them into create_omp_child_function.
	(expand_omp_taskreg): Added code to extract the high and low parameters
	from the child function and then insert it in the appropriate location.
	Added a call to expand_cilk_for_body.  Allowed the insertion of the
	library calls when the taskreg being expanded is not a _Cilk_for.
	(expand_omp_for): Added a check for GF_OMP_FOR_KIND_CILKFOR for the
	for statement's kind.  If so then call insert_cilk_for_fn_call.
	* tree-core.h (enum omp_clause_schedule_kind): Added a new field
	OMP_CLAUSE_SCHEDULE_CILK_FOR.
	* tree.def (CILK_FOR): New tree.

gcc/c-family/ChangeLog
2014-01-07  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-omp.c (c_finish_omp_for): Added a check for CILK_FOR along with
	CILK_SIMD.
	* c-common.h (enum rid): Added new value called "RID_CILK_FOR."
	* c-common.c (c_common_reswords[]): Added a new field "_Cilk_for."
	* c-pragma.c (init_pragma): Added cilk grainsize pragma.
	* c-pragma.h (enum pragma_kind): Added PRAGMA_CILK_GRAINSIZE.

gcc/c/ChangeLog
2014-01-07  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-parser.c (c_parser_statement_after_labels): Added a RID_CILK_FOR
	case.
	(c_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(c_parser_omp_for_loop): Renamed the "clauses" parameter to
	"clauses_or_grain."  Added handling for _Cilk_for statements.  Set
	the grain value to the clauses location.
	(c_parser_cilk_grainsize): New function.
	(c_parser_cilk_simd): Added a new parameter called grain.  Also added
	support to parse _Cilk_for statements.

gcc/testsuite/ChangeLog
2014-01-07  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk-fors.c: New testcase.
	* c-c++-common/cilk-plus/CK/nested_cilk_for.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.

[-- Attachment #4: cp-ChangeLog --]
[-- Type: application/octet-stream, Size: 1532 bytes --]

gcc/cp/ChangeLog
2014-01-07  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* parser.c (cp_parser_statement): Added a RID_CILK_FOR case.
	(cp_parser_omp_for_cond): Included a check for CILK_FOR along with
	CILK_SIMD.
	(cp_parser_omp_for_loop): Overall, added support to parse _Cilk_for
	statement along with omp for statements.
	(cp_parser_cilk_grainsize): New function.
	(cp_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(cp_parser_cilk_simd): Added a new parameter for grain.  Added support
	to handle _Cilk_for loops along with #pragma simd for loops.
	* pt.c (tsubst_expr): Added CILK_FOR case.  If the tree is CILK_FOR
	then just RECUR its clauses, instead of calling tsubst_omp_clauses.
	* semantics.c (handle_omp_for_class_iterator): Added 2 new parameters.
	Added a NE_EXPR case.  Added a check for _Cilk_for statement and
	if so, then give a name for the new induction variable.
	(finish_omp_for): Added a check if the code is _Cilk_for and if true
	then insert all the iterator temporary variables into the _Cilk_for
	body.

gcc/testsuite/ChangeLog
2014-01-07  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Made certain error
	tags C specific and inserted their C++ equivalents.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* g++.dg/cilk-plus/CK/cilk-for-tplt.cc: New testcase.
	* g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_rev_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_test.cc: Likewise.


[-- Attachment #5: diff_c.txt --]
[-- Type: text/plain, Size: 64484 bytes --]

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 40d12bc..9d24691
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -414,6 +414,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7e3ece6..0eaebf3
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index ac380ee..b15cd4c
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -386,7 +386,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
   bool fail = false;
   int i;
 
-  if (code == CILK_SIMD
+  if ((code == CILK_SIMD || code == CILK_FOR) 
       && !c_check_cilk_loop (locus, TREE_VEC_ELT (declv, 0)))
     fail = true;
 
@@ -516,7 +516,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 					    0))
 		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
 					 ? LT_EXPR : GE_EXPR);
-		  else if (code != CILK_SIMD)
+		  else if (code != CILK_SIMD && code != CILK_FOR)
 		    cond_ok = false;
 		}
 	    }
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index af28085..6f22148
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1394,6 +1394,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 6f1bf74..b9f09ba
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index f73df08..f0320ec
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1248,10 +1248,11 @@ static bool c_parser_objc_diagnose_bad_element_prefix
   (c_parser *, struct c_declspecs *);
 
 /* Cilk Plus supporting routines.  */
-static void c_parser_cilk_simd (c_parser *);
+static void c_parser_cilk_simd (c_parser *, tree);
 static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context);
 static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
 static tree c_parser_cilk_clause_vectorlength (c_parser *, tree, bool);
+static void c_parser_cilk_grainsize (c_parser *);
 
 /* Parse a translation unit (C90 6.7, C99 6.9).
 
@@ -4878,6 +4879,16 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (c_parser_peek_token (parser)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      c_parser_skip_to_end_of_block_or_statement (parser);
+	    }
+	  else
+	    c_parser_cilk_simd (parser, integer_zero_node);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9488,7 +9499,25 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       if (!c_parser_cilk_verify_simd (parser, context))
 	return false;
       c_parser_consume_pragma (parser);
-      c_parser_cilk_simd (parser);
+      c_parser_cilk_simd (parser, NULL_TREE);
+      return false;
+
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not"
+		   " enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
       return false;
 
     default:
@@ -11583,7 +11612,7 @@ c_parser_omp_flush (c_parser *parser)
 
 static tree
 c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
-		       tree clauses, tree *cclauses)
+		       tree clauses_or_grain, tree *cclauses)
 {
   tree decl, cond, incr, save_break, save_cont, body, init, stmt, cl;
   tree declv, condv, incrv, initv, ret = NULL;
@@ -11591,6 +11620,9 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   int i, collapse = 1, nbraces = 0;
   location_t for_loc;
   vec<tree, va_gc> *for_block = make_tree_vector ();
+  tree clauses = code == CILK_FOR ? NULL_TREE : clauses_or_grain;
+  tree grain = code == CILK_FOR ? clauses_or_grain : NULL_TREE;
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
 
   for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl))
     if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE)
@@ -11603,11 +11635,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   condv = make_tree_vec (collapse);
   incrv = make_tree_vec (collapse);
 
-  if (!c_parser_next_token_is_keyword (parser, RID_FOR))
+  if (code != CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_FOR))
     {
       c_parser_error (parser, "for statement expected");
       return NULL;
     }
+  if (code == CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_CILK_FOR))
+    {
+      c_parser_error (parser, "_Cilk_for statement expected");
+      return NULL;
+    }
   for_loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
 
@@ -11685,7 +11724,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	    case LE_EXPR:
 	      break;
 	    case NE_EXPR:
-	      if (code == CILK_SIMD)
+	      if (code == CILK_SIMD || code == CILK_FOR)
 		break;
 	      /* FALLTHRU.  */
 	    default:
@@ -11767,6 +11806,12 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
     c_break_label = size_one_node;
   save_cont = c_cont_label;
   c_cont_label = NULL_TREE;
+
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = c_begin_omp_parallel ();
+    }
   body = push_stmt_list ();
 
   if (open_brace_parsed)
@@ -11814,6 +11859,13 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	}
     }
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = c_finish_omp_parallel (loc, NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
+
   /* Only bother calling c_finish_omp_for if we haven't already generated
      an error from the initialization parsing.  */
   if (!fail)
@@ -11859,6 +11911,17 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 		  }
 	    }
 	  OMP_FOR_CLAUSES (stmt) = clauses;
+	  /* If it is a _Cilk_for statement, then the OMP_FOR_CLAUSES location
+	     stores the user-defined grain value or an integer_zero_node 
+	     indicating that the runtime must compute a suitable grain.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree l = build_omp_clause (EXPR_LOCATION (grain),
+					 OMP_CLAUSE_SAFELEN);
+	      OMP_CLAUSE_SAFELEN_EXPR (l) = grain;
+	      OMP_CLAUSE_CHAIN (l) = NULL_TREE;
+	      OMP_FOR_CLAUSES (stmt) = l;
+	    }
 	}
       ret = stmt;
     }
@@ -13762,16 +13825,65 @@ c_parser_cilk_all_clauses (c_parser *parser)
   return c_finish_cilk_clauses (clauses);
 }
 
+/* This function helps parse the grainsize pragma for a _Cilk_for statement. 
+   Here is the correct syntax of this pragma: 
+	    #pragma cilk grainsize = <EXP>  */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+
+  /* Consume the 'grainsize' keyword.  */
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>") != 0)
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR)
+	{
+	  error_at (input_location, "cannot convert grain to long integer");
+	  c_parser_skip_to_pragma_eol (parser);
+	}   
+      else if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token && token->type == CPP_KEYWORD
+	      && token->keyword == RID_CILK_FOR)
+	    {
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain && grain != error_mark_node) 
+		c_parser_cilk_simd (parser, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
    loops.  */
 
 static void
-c_parser_cilk_simd (c_parser *parser)
+c_parser_cilk_simd (c_parser *parser, tree grain)
 {
-  tree clauses = c_parser_cilk_all_clauses (parser);
+  bool is_cilk_for = grain == NULL_TREE ? false : true;
+  tree clauses = NULL_TREE;
+  
+  if (!is_cilk_for)
+    clauses = c_parser_cilk_all_clauses (parser);
+  else 
+    clauses = grain;
   tree block = c_begin_compound_stmt (true);
   location_t loc = c_parser_peek_token (parser)->location;
-  c_parser_omp_for_loop (loc, parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  c_parser_omp_for_loop (loc, parser, code, clauses, NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 }
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
index 9f3240a..bf319d5
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
index afe88c9..bc1092b
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -106,6 +106,26 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME.  */
+
+static tree
+cilk_declare_looper (const char *name, tree type, enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -269,6 +289,15 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_32",
+                                            unsigned_intSI_type_node,
+                                            BUILT_IN_CILK_FOR_32);
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_64",
+                                            unsigned_intDI_type_node,
+                                            BUILT_IN_CILK_FOR_64);
+
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
index d2ae931..0e98998
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,9 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
+
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -77,6 +80,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 2d1e1c7..1e7bebf 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1158,6 +1158,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_DISTRIBUTE:
 	  pp_string (buffer, "#pragma omp distribute");
 	  break;
+	case GF_OMP_FOR_KIND_CILKFOR:
+	  gcc_assert (flag_enable_cilkplus);
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -1167,7 +1170,11 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	  if (i)
 	    spc += 2;
 	  newline_and_indent (buffer, spc);
-	  pp_string (buffer, "for (");
+	  if (flag_enable_cilkplus 
+	      && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
+	    pp_string (buffer, "_Cilk_for (");
+	  else
+	    pp_string (buffer, "for (");
 	  dump_generic_node (buffer, gimple_omp_for_index (gs, i), spc,
 			     flags, false);
 	  pp_string (buffer, " = ");
@@ -1192,6 +1199,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	    case GE_EXPR:
 	      pp_greater_equal (buffer);
 	      break;
+	    case NE_EXPR:
+	      pp_string (buffer, "!=");
+	      break;
 	    default:
 	      gcc_unreachable ();
 	    }
diff --git a/gcc/gimple.h b/gcc/gimple.h
index df92863..42304fd
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -91,13 +91,14 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
-    GF_OMP_FOR_KIND_MASK	= 3 << 0,
+    GF_OMP_FOR_KIND_MASK	= 7 << 0,
     GF_OMP_FOR_KIND_FOR		= 0 << 0,
     GF_OMP_FOR_KIND_DISTRIBUTE	= 1 << 0,
     GF_OMP_FOR_KIND_SIMD	= 2 << 0,
     GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
-    GF_OMP_FOR_COMBINED		= 1 << 2,
-    GF_OMP_FOR_COMBINED_INTO	= 1 << 3,
+    GF_OMP_FOR_KIND_CILKFOR     = 4 << 0,
+    GF_OMP_FOR_COMBINED		= 1 << 3,
+    GF_OMP_FOR_COMBINED_INTO	= 1 << 4,
     GF_OMP_TARGET_KIND_MASK	= 3 << 0,
     GF_OMP_TARGET_KIND_REGION	= 0 << 0,
     GF_OMP_TARGET_KIND_DATA	= 1 << 0,
@@ -523,6 +524,9 @@ struct GTY(()) gimple_omp_for_iter {
 
   /* Increment.  */
   tree incr;
+
+  /* Grain value, only used by _Cilk_for.  */
+  tree grain;
 };
 
 /* GIMPLE_OMP_FOR */
@@ -4562,6 +4566,37 @@ gimple_omp_for_set_pre_body (gimple gs, gimple_seq pre_body)
   omp_for_stmt->pre_body = pre_body;
 }
 
+/* Set GRAIN to be the grain value used by Cilk runtime for OMP_FOR GS.  */
+
+static inline void
+gimple_cilk_for_set_grain (tree grain, gimple gs)
+{
+  const gimple_statement_omp_for *omp_for_stmt =
+    as_a <gimple_statement_omp_for> (gs);
+  omp_for_stmt->iter[0].grain = grain;
+}
+
+/* Returns the induction variable of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_induction_var (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->index;
+}
+
+/* Returns the GRAIN value of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_grain (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->grain;
+}
 
 /* Return the clauses associated with OMP_PARALLEL GS.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index a6e0c75..09e4d33
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6559,7 +6559,19 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
   bool simd;
   bitmap has_decl_expr = NULL;
 
+  tree grain = NULL_TREE;
+  tree orig_init = NULL_TREE, orig_cond = NULL_TREE, orig_incr = NULL_TREE;
   orig_for_stmt = for_stmt = *expr_p;
+  
+  if (TREE_CODE (for_stmt) == CILK_FOR) 
+    { 
+      /* The user cannot pass any clauses for _Cilk_for,
+	 thus the grain value is stored in
+	 a safelen clause.  */
+      grain = OMP_FOR_CLAUSES (for_stmt);
+      grain = OMP_CLAUSE_SAFELEN_EXPR (grain);
+      OMP_FOR_CLAUSES (for_stmt) = NULL_TREE;
+    }
 
   simd = TREE_CODE (for_stmt) == OMP_SIMD
     || TREE_CODE (for_stmt) == CILK_SIMD;
@@ -6603,6 +6615,11 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     }
 
   for_body = NULL;
+  if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
+    {
+      tree it = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), 0);
+      gimplify_and_add (it, &for_pre_body);
+    }
   gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
 	      == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
   gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
@@ -6677,7 +6694,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	}
       else
 	var = decl;
-
+ 
+      /* Original initial, final and increment values are necessary to compute
+	 the loop-count.  Otherwise, they are stored in variables and their
+	 context could be changed, potentially making it impossible to compute
+	 them correctly.  */
+      orig_init = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6689,10 +6711,18 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gcc_assert (COMPARISON_CLASS_P (t));
       gcc_assert (TREE_OPERAND (t, 0) == decl);
 
-      tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-			    is_gimple_val, fb_rvalue);
-      ret = MIN (ret, tret);
-
+      if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
+	{
+	  int x = 1;
+	  orig_cond = TREE_OPERAND (t, 1);
+	  copy_tree_r (&orig_cond, &x, NULL);
+	}
+      else
+	{
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL, 
+				is_gimple_val, fb_rvalue);
+	  ret = MIN (ret, tret);
+	}
       /* Handle OMP_FOR_INCR.  */
       t = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
       switch (TREE_CODE (t))
@@ -6713,6 +6743,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	    t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	    t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	    TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	    orig_incr = build_one_cst (TREE_TYPE (t));
 	    break;
 	  }
 
@@ -6726,6 +6757,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  orig_incr = build_one_cst (TREE_TYPE (t));
 	  break;
 
 	case MODIFY_EXPR:
@@ -6753,8 +6785,16 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	      gcc_unreachable ();
 	    }
 
-	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-				is_gimple_val, fb_rvalue);
+	  orig_incr = TREE_OPERAND (t, 1);
+	  /* Right here we are just trying to extract the absolute
+	     value of the increment.  */
+	  if (TREE_CODE (t) == MINUS_EXPR
+	      || TREE_CODE  (TREE_OPERAND (t, 1)) == NEGATE_EXPR
+	      || (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST
+		  && tree_int_cst_sgn (TREE_OPERAND (t, 1)) < 1))
+	    orig_incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), orig_incr);
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body,
+				NULL, is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
 	  if (c)
 	    {
@@ -6802,8 +6842,57 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 
   BITMAP_FREE (has_decl_expr);
 
+  tree incr_val = NULL_TREE, init_val = NULL_TREE, cond_val = NULL_TREE;
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      tree stmt_list = alloc_stmt_list ();
+      incr_val = create_tmp_var (TREE_TYPE (orig_incr), "__cilk_incr");
+      tree mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_incr), incr_val,
+			 orig_incr);
+      append_to_statement_list (mod, &stmt_list);
+
+      init_val = create_tmp_var (TREE_TYPE (orig_init), "__cilk_init");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_init), init_val, orig_init);
+      append_to_statement_list (mod, &stmt_list);
+
+      cond_val = create_tmp_var (TREE_TYPE (orig_cond), "__cilk_cond");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_cond), cond_val, orig_cond);
+      append_to_statement_list (mod, &stmt_list);
+  
+      gimplify_and_add (stmt_list, &for_pre_body);
+    }
   gimplify_and_add (OMP_FOR_BODY (orig_for_stmt), &for_body);
+ 
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      /* Sometimes an assign is inserted before the OMP_FOR_BODY.  So,
+	 search and find the omp for body.  */
+      gimple for_body_stmt = NULL;
+      for (gimple_stmt_iterator gsi = gsi_start (for_body); !gsi_end_p (gsi);
+	   gsi_next (&gsi))
+	{
+	  for_body_stmt = gsi_stmt (gsi);
+	  if (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL)
+	    break;
+	}
+      gcc_assert (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL);
+      tree orig_clses = gimple_omp_parallel_clauses (for_body_stmt);
+      tree new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = init_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = cond_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
 
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = incr_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      gimple_omp_parallel_set_clauses (for_body_stmt, new_clause);
+    }
   if (orig_for_stmt != for_stmt)
     for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
       {
@@ -6825,6 +6914,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
     case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
     case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break;
+    case CILK_FOR: kind = GF_OMP_FOR_KIND_CILKFOR; break;
     case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break;
     default:
       gcc_unreachable ();
@@ -6859,6 +6949,9 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gimple_omp_for_set_incr (gfor, i, TREE_OPERAND (t, 1));
     }
 
+  if (kind == GF_OMP_FOR_KIND_CILKFOR) 
+    gimple_cilk_for_set_grain (grain, gfor);
+
   gimplify_seq_add_stmt (pre_p, gfor);
   if (ret != GS_ALL_DONE)
     return GS_ERROR;
@@ -7880,6 +7973,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OMP_FOR:
 	case OMP_SIMD:
 	case CILK_SIMD:
+	case CILK_FOR:
 	case OMP_DISTRIBUTE:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index f1ec1c6..0beaa2a
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "tree-nested.h"
 #include "tree-eh.h"
+#include "cilk.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -198,6 +199,12 @@ struct omp_for_data
   struct omp_for_data_loop *loops;
 };
 
+/* A structure with the necessary elements from a _Cilk_for statement.  This
+   structure is passed in through WALK_STMT_INFO->INFO.  */
+typedef struct cilk_for_information {
+  bool found;
+  tree induction_var;
+} cilk_for_info;
 
 static splay_tree all_contexts;
 static int taskreg_nesting_level;
@@ -314,6 +321,9 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   fd->have_ordered = false;
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
   fd->chunk_size = NULL_TREE;
+  if (flag_enable_cilkplus 
+      && gimple_omp_for_kind (fd->for_stmt) ==  GF_OMP_FOR_KIND_CILKFOR)
+    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
   collapse_iter = NULL;
   collapse_count = NULL;
 
@@ -391,8 +401,10 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	case GT_EXPR:
 	  break;
 	case NE_EXPR:
-	  gcc_assert (gimple_omp_for_kind (for_stmt)
-		      == GF_OMP_FOR_KIND_CILKSIMD);
+	  gcc_assert ((gimple_omp_for_kind (for_stmt)
+		       == GF_OMP_FOR_KIND_CILKSIMD)
+		      || (gimple_omp_for_kind (for_stmt)
+			  == GF_OMP_FOR_KIND_CILKFOR));
 	  break;
 	case LE_EXPR:
 	  if (POINTER_TYPE_P (TREE_TYPE (loop->n2)))
@@ -897,7 +909,31 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx)
 	 variable no longer really shared.  */
       if (shared_ctx->is_nested)
 	{
-	  omp_context *up;
+	  omp_context *up = shared_ctx->outer;
+
+	  /* If VAR is the induction variable of the outer _Cilk_for, then
+	     it needs to be passed as a value not pointer since it
+	     would not be overwritten by the body.  */
+	  if (flag_enable_cilkplus
+	      && gimple_code (up->stmt) == GIMPLE_OMP_FOR
+	      && gimple_omp_for_kind (up->stmt) == GF_OMP_FOR_KIND_CILKFOR) 
+	    while (up) 
+	      { 
+		if (gimple_code (up->stmt) == GIMPLE_OMP_FOR
+		    && gimple_omp_for_kind (up->stmt)
+		    == GF_OMP_FOR_KIND_CILKFOR)
+		  {
+		    struct omp_for_data fd;
+		    /* _Cilk_for always has collapse = 1.  */
+		    struct omp_for_data_loop *loops
+		      = (struct omp_for_data_loop *)
+		      alloca (sizeof (struct omp_for_data_loop));
+		    extract_omp_for_data (up->stmt, &fd, loops);
+		    if (DECL_NAME (decl) == DECL_NAME (fd.loop.v))
+		      return false;
+		  }
+		up = up->outer;
+	      }
 
 	  for (up = shared_ctx->outer; up; up = up->outer)
 	    if (is_taskreg_ctx (up) && maybe_lookup_decl (decl, up))
@@ -1818,27 +1854,112 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	scan_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 }
 
-/* Create a new name for omp child function.  Returns an identifier.  */
+/* Create a new name for omp child function.  Returns an identifier.  If 
+   IS_CILK_FOR is true then the suffix for the child function is 
+   "_cilk_for_fn."  */
 
 static tree
-create_omp_child_function_name (bool task_copy)
+create_omp_child_function_name (bool task_copy, bool is_cilk_for)
 {
+  if (is_cilk_for)
+    return clone_function_name (current_function_decl, "_cilk_for_fn");
   return (clone_function_name (current_function_decl,
 			       task_copy ? "_omp_cpyfn" : "_omp_fn"));
 }
 
+/* Helper function for walk_gimple_seq.  *GSI_P is the gimple statement
+   iterator passed by walk_gimple_seq and *WI->INFO holds the CILK_FOR_INFO
+   structure.  This function sets the values inside this structure if it
+   finds a _Cilk_for statement in *GSI_P.  HANDLED_OPS_P is unused.  */
+
+static tree
+find_cilk_for_stmt (gimple_stmt_iterator *gsi_p,
+		    bool *handled_ops_p ATTRIBUTE_UNUSED,
+		    struct walk_stmt_info *wi)
+{
+  cilk_for_info *cf_info = (cilk_for_info *) wi->info;
+  gimple stmt = gsi_stmt (*gsi_p);
+
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+      && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_CILKFOR)
+      /* For nested _Cilk_for statements, just look into the
+	 outer-most one.  */
+      && cf_info->found == false)
+    {
+      cf_info->found = true;
+      cf_info->induction_var = gimple_cilk_for_induction_var (stmt);
+    }
+  return NULL_TREE;
+}
+
+/* Returns true if STMT contains a CILK_FOR statement.  If one is found,
+   *IND_VAR is set to its induction variable; otherwise it is left untouched.
+   IND_VAR can be NULL, in which case nothing is stored.  */
+
+static bool
+is_cilk_for_stmt (gimple stmt, tree *ind_var)
+{
+  if (!flag_enable_cilkplus)
+    return false;
+    
+  gimple_seq body = stmt;
+  struct walk_stmt_info wi;
+  cilk_for_info cf_info;
+  memset (&cf_info, 0, sizeof (cilk_for_info));
+  memset (&wi, 0, sizeof (wi));
+  wi.info = &cf_info;
+  walk_gimple_seq (body, find_cilk_for_stmt, NULL, &wi);
+  if (cf_info.found)
+    {
+      if (ind_var)
+	*ind_var = cf_info.induction_var;
+      return true;
+    }
+    
+  return false;
+}
+
+/* Returns the type of the induction variable for the child function for
+   _Cilk_for and the types for _high and _low variables based on TYPE.  */
+
+static tree
+cilk_for_check_loop_diff_type (tree type)
+{
+  if (type == integer_type_node)
+    return type;
+  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
+    { 
+      if (TYPE_UNSIGNED (type)) 
+	return uint32_type_node;
+      else
+	return integer_type_node;
+    }
+  else
+    {
+      if (TYPE_UNSIGNED (type)) 
+	return uint64_type_node;
+      else
+	return long_long_integer_type_node;
+    }
+  gcc_unreachable ();
+}
+
 /* Build a decl for the omp child function.  It'll not contain a body
    yet, just the bare decl.  */
 
 static void
-create_omp_child_function (omp_context *ctx, bool task_copy)
+create_omp_child_function (omp_context *ctx, bool task_copy,
+			   bool is_cilk_for, tree cilk_var_type)
 {
   tree decl, type, name, t;
-
-  name = create_omp_child_function_name (task_copy);
+ 
+  name = create_omp_child_function_name (task_copy, is_cilk_for);
   if (task_copy)
     type = build_function_type_list (void_type_node, ptr_type_node,
 				     ptr_type_node, NULL_TREE);
+  else if (is_cilk_for)
+    type = build_function_type_list (void_type_node, ptr_type_node,
+				     cilk_var_type, cilk_var_type, NULL_TREE);
   else
     type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
 
@@ -1888,6 +2009,33 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_CONTEXT (t) = decl;
   DECL_RESULT (decl) = t;
 
+  /* _Cilk_for's child function requires two extra parameters called 
+     __low and __high that are set by the Cilk runtime when it calls this
+     function.  */
+  if (is_cilk_for)
+    {
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__high"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__low"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+    }
   t = build_decl (DECL_SOURCE_LOCATION (decl),
 		  PARM_DECL, get_identifier (".omp_data_i"), ptr_type_node);
   DECL_ARTIFICIAL (t) = 1;
@@ -1895,6 +2043,8 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_ARG_TYPE (t) = ptr_type_node;
   DECL_CONTEXT (t) = current_function_decl;
   TREE_USED (t) = 1;
+  if (is_cilk_for)
+    DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
   DECL_ARGUMENTS (decl) = t;
   if (!task_copy)
     ctx->receiver_decl = t;
@@ -2016,7 +2166,15 @@ scan_omp_parallel (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+
+  tree ind_var = NULL_TREE;
+  bool is_cilk_for = (flag_enable_cilkplus && outer_ctx
+		      && is_cilk_for_stmt (outer_ctx->stmt, &ind_var));
+  tree cilk_var_type =
+    (is_cilk_for ? cilk_for_check_loop_diff_type (TREE_TYPE (ind_var))
+     : NULL_TREE);
+
+  create_omp_child_function (ctx, false, is_cilk_for, cilk_var_type);
   gimple_omp_parallel_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_parallel_clauses (stmt), ctx);
@@ -2061,7 +2219,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+  create_omp_child_function (ctx, false, false, NULL_TREE);
   gimple_omp_task_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_task_clauses (stmt), ctx);
@@ -2074,7 +2232,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
       DECL_ARTIFICIAL (name) = 1;
       DECL_NAMELESS (name) = 1;
       TYPE_NAME (ctx->srecord_type) = name;
-      create_omp_child_function (ctx, true);
+      create_omp_child_function (ctx, true, false, NULL_TREE);
     }
 
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
@@ -2199,7 +2357,7 @@ scan_omp_target (gimple stmt, omp_context *outer_ctx)
   TYPE_NAME (ctx->record_type) = name;
   if (kind == GF_OMP_TARGET_KIND_REGION)
     {
-      create_omp_child_function (ctx, false);
+      create_omp_child_function (ctx, false, false, NULL_TREE);
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
@@ -2993,6 +3151,15 @@ lower_rec_simd_input_clauses (tree new_var, omp_context *ctx, int &max_vf,
   return true;
 }
 
+/* Returns true if the name of DECL starts with NAME.  */
+
+static inline bool
+is_cilk_loop_var (tree decl, const char *name)
+{
+  return (DECL_NAME (decl) && !strncmp (IDENTIFIER_POINTER (DECL_NAME (decl)), 
+					name, strlen (name))); 
+}
+
 /* Generate code to implement the input clauses, FIRSTPRIVATE and COPYIN,
    from the receiver (aka child) side and initializers for REFERENCE_TYPE
    private variables.  Initialization statements go in ILIST, while calls
@@ -3245,6 +3412,18 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 	      SET_DECL_VALUE_EXPR (new_var, x);
 	      DECL_HAS_VALUE_EXPR_P (new_var) = 1;
 
+	      /* In _Cilk_for, the increment, start and final values
+		 are stored in the clauses inserted by gimplify_omp_for.
+		 The child function uses them to compute the value of the
+		 induction variable from its __low and __high parameters.
+		 Record the decl value expressions here so that they are
+		 easy to retrieve later.  */
+	      if (flag_enable_cilkplus 
+		  && (is_cilk_loop_var (var, "__cilk_init") 
+		      || is_cilk_loop_var (var, "__cilk_cond")
+		      || is_cilk_loop_var (var, "__cilk_incr"))) 
+		SET_DECL_VALUE_EXPR (var, x);
 	      /* ??? If VAR is not passed by reference, and the variable
 		 hasn't been initialized yet, then we'll get a warning for
 		 the store into the omp_data_s structure.  Ideally, we'd be
@@ -4628,6 +4807,250 @@ expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from)
     }
 }
 
+/* Returns true if T is a COMPONENT_REF whose field is named D_F_NAME and
+   whose data argument is named D_ARG_NAME.  */
+
+static bool
+cilk_find_field_value (tree t, tree d_arg_name, tree d_f_name)
+{
+  if (TREE_CODE (t) == COMPONENT_REF)
+    {
+      tree arg = TREE_OPERAND (t, 0);
+      tree field = TREE_OPERAND (t, 1);
+      if (TREE_CODE (arg) == ADDR_EXPR || TREE_CODE (arg) == MEM_REF)
+	arg = TREE_OPERAND (arg, 0);
+      if (DECL_NAME (arg) && DECL_NAME (field)
+	  && !strcmp (IDENTIFIER_POINTER (d_arg_name),
+		      IDENTIFIER_POINTER (DECL_NAME (arg)))
+	  && !strcmp (IDENTIFIER_POINTER (d_f_name),
+		      IDENTIFIER_POINTER (DECL_NAME (field)))) 
+	return true;
+    }
+  return false;
+}
+
+/* Find the COMPONENT_REF in all the basic blocks in REGION whose 
+   data-argument is DATA_ARG and field is FIELD and then replace that 
+   COMPONENT_REF value with NEW_VALUE, a VAR_DECL.  */
+
+static void
+cilk_for_find_component_expr (struct omp_region *region, tree data_arg,
+			      tree field, tree new_value)
+{
+  vec<basic_block> bbs;
+  basic_block bb;
+  unsigned ii;
+  tree new_val = NULL_TREE;
+  bbs.create (0);
+  gather_blocks_in_sese_region (region->entry, region->exit, &bbs);
+  /* No need to push the entry bb into BBS since it doesn't get inserted
+     into the child function.  */
+  
+  tree da_name = DECL_NAME (data_arg);
+  tree df_name = DECL_NAME (field);
+  FOR_EACH_VEC_ELT (bbs, ii, bb)    
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+	 gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+	if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	  for (unsigned jj = 1; jj < gimple_num_ops (stmt); jj++)
+	    {
+	      tree *op = gimple_op_ptr (stmt, jj);
+	      if (TREE_CODE (*op) == COMPONENT_REF
+		  && cilk_find_field_value (*op, da_name, df_name))
+		{    
+		  if (TREE_TYPE (*op) == TREE_TYPE (new_value))
+		    new_val = new_value;
+		  else
+		    {
+		      tree t = fold_convert (TREE_TYPE (*op), new_value);
+		      new_val =
+			force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+						  true, GSI_NEW_STMT);
+		    }
+		  gsi_insert_before (&gsi, gimple_build_assign (*op, new_val), 
+				     GSI_NEW_STMT);
+		  *op = new_val;
+		}
+	    }
+      }
+}
+
+/* Find the initial, final and increment values in BODY_STMT's clause
+   and store them in *INIT, *FINAL and *INCR parameters respectively.  */
+
+static void
+find_cilk_for_vars (gimple body_stmt, tree *init, tree *final, tree *incr)
+{
+  /* The names of the initial, final and increment values start with
+     __cilk_init, __cilk_cond and __cilk_incr, respectively.  They are
+     listed in shared clauses, so search those.  */
+  for (tree cc = gimple_omp_parallel_clauses (body_stmt); cc; 
+       cc = OMP_CLAUSE_CHAIN (cc))
+    if (OMP_CLAUSE_CODE (cc) == OMP_CLAUSE_SHARED)
+      {
+	tree decl = OMP_CLAUSE_DECL (cc);
+	if (is_cilk_loop_var (decl, "__cilk_incr"))
+	  { 
+	    *incr = decl;
+	    if (DECL_VALUE_EXPR (*incr))
+	      *incr = DECL_VALUE_EXPR (*incr);
+	  } 
+	else if (is_cilk_loop_var (decl, "__cilk_init"))
+	  { 
+	    *init = decl;
+	    if (DECL_VALUE_EXPR (*init))
+	      *init = DECL_VALUE_EXPR (*init);
+	  }
+	else if (is_cilk_loop_var (decl, "__cilk_cond"))
+	  { 
+	    *final = decl;
+	    if (DECL_VALUE_EXPR (*final))
+	      *final = DECL_VALUE_EXPR (*final);
+	  }
+      }
+}
+ 
+/* Expand the _Cilk_for body starting at REGION.  DATA_ARG, LOW and HIGH
+   are the data argument and the __low and __high parameters of the child
+   function.  */
+
+static void
+expand_cilk_for_body (struct omp_region *region, tree data_arg,
+		      tree low, tree high)
+{
+  struct omp_for_data fd;
+  struct omp_for_data_loop *loops;
+  loops
+    = (struct omp_for_data_loop *)
+      alloca (gimple_omp_for_collapse (last_stmt (region->outer->entry))
+	      * sizeof (struct omp_for_data_loop));
+  extract_omp_for_data (last_stmt (region->outer->entry), &fd, loops);
+  region->sched_kind = fd.sched_kind;
+  basic_block entry_bb = region->entry;
+  
+  /* This is where the body is and the location where we must insert
+     the modification to the induction variable.  */
+  basic_block body_bb = single_succ (region->entry);
+  gimple entry_stmt = last_stmt (region->entry);
+  
+  /* Split the first basic block into two and put the initializer values
+     in the top one.  */
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  basic_block l1_bb = split_block (entry_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (l1_bb);
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd.loop.v));
+  tree ind_var = create_tmp_reg (type, "__cilk_ind_var");
+  tree t = fold_convert (type, low);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_NEW_STMT);
+  gimple stmt = gimple_build_assign (ind_var, fold_convert (type, t));
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  vec_alloc (region->ws_args, 2);
+  tree t1 = null_pointer_node;
+  tree t2 = build_fold_addr_expr (gimple_omp_parallel_child_fn (entry_stmt));
+  if (data_arg)
+    {
+      t1 = build_fold_addr_expr (gimple_omp_parallel_data_arg (entry_stmt));
+      gsi = gsi_start_bb (body_bb);
+      tree init = NULL_TREE, final_val = NULL_TREE, incr = NULL_TREE;
+      find_cilk_for_vars (entry_stmt, &init, &final_val, &incr);
+
+      tree step = fd.loop.step;
+      if (TREE_CODE (fd.loop.step) != INTEGER_CST)
+	step = incr;      
+      step = fold_convert (type, step);
+      if (TREE_CODE (step) == INTEGER_CST && tree_int_cst_sgn (step) < 1)
+	step = fold_build1_loc (UNKNOWN_LOCATION, NEGATE_EXPR, type, step);
+      
+      tree tmp = create_tmp_reg (type, NULL);
+      gsi_insert_before (&gsi, gimple_build_assign (tmp, step),
+			 GSI_NEW_STMT);
+      t = build2 (MULT_EXPR, type, ind_var, tmp);
+      tree tmp2 = create_tmp_reg (type, NULL);
+      gsi_insert_after (&gsi, gimple_build_assign (tmp2, t), GSI_NEW_STMT);
+
+      tmp = create_tmp_reg (type, NULL);
+      init = fold_convert (type, init);
+      tree init_tmp = force_gimple_operand_gsi
+	(&gsi, init, true, NULL_TREE, false, GSI_CONTINUE_LINKING); 
+
+      gsi_insert_after (&gsi, gimple_build_assign (tmp, init_tmp), 
+			GSI_NEW_STMT);
+      if (fd.loop.cond_code == GE_EXPR || fd.loop.cond_code == GT_EXPR) 
+	t = fold_build2 (MINUS_EXPR, type, tmp, tmp2);
+      else 
+	t = fold_build2 (PLUS_EXPR, type, tmp, tmp2);
+
+      t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false, 
+				    GSI_CONTINUE_LINKING);
+      tree tmp3 = create_tmp_reg (type, NULL);
+      gimple stmt = gimple_build_assign (tmp3, t);
+      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+      cilk_for_find_component_expr (region, data_arg, fd.loop.v, tmp3);
+    }
+  region->ws_args->quick_push (t1);
+  region->ws_args->quick_push (t2);
+  
+  gsi = gsi_last_bb (l1_bb);
+  basic_block cond_bb = split_block (l1_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (l1_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (cond_bb);
+  t = fold_convert (type, high);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_CONTINUE_LINKING);
+  t = build2 (LT_EXPR, boolean_type_node, ind_var, t);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p,
+		 NULL, NULL)
+      || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p,
+		    NULL, NULL))
+    {
+      gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+
+  /* Insert incrementing of induction variable.  */
+  gsi = gsi_last_bb (body_bb);
+  t = build2 (PLUS_EXPR, type, ind_var, build_one_cst (type));
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+  gsi_insert_after (&gsi, gimple_build_assign (ind_var, t),
+		    GSI_CONTINUE_LINKING);
+  
+  basic_block exit_bb = region->exit;
+
+  gsi = gsi_last_bb (exit_bb);
+  basic_block last_bb = split_block (exit_bb, gsi_stmt (gsi))->dest;
+  
+  /* Remove the #pragma omp return.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+  
+  gsi = gsi_last_bb (last_bb);
+  gsi_insert_before (&gsi, gimple_build_return (NULL), GSI_SAME_STMT);
+  
+  /* Now connect all the basic-blocks.  */
+  edge e = make_edge (cond_bb, last_bb, EDGE_FALSE_VALUE);
+  e->probability = REG_BR_PROB_BASE / 4;
+
+  edge e3 = find_edge (cond_bb, body_bb);
+  e3->probability = REG_BR_PROB_BASE * 3 / 4;
+  e3->flags = EDGE_TRUE_VALUE;
+  
+  edge e2 = find_edge (exit_bb, last_bb);
+  remove_edge (e2);
+  e2 = make_edge (exit_bb, cond_bb, EDGE_FALLTHRU);
+  e2->probability = 1;
+  region->exit = last_bb;
+}
+
 /* Expand the OpenMP parallel or task directive starting at REGION.  */
 
 static void
@@ -4640,6 +5063,7 @@ expand_omp_taskreg (struct omp_region *region)
   gimple entry_stmt, stmt;
   edge e;
   vec<tree, va_gc> *ws_args;
+  gimple parcopy_stmt = NULL;
 
   entry_stmt = last_stmt (region->entry);
   child_fn = gimple_omp_taskreg_child_fn (entry_stmt);
@@ -4648,6 +5072,16 @@ expand_omp_taskreg (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  /* The compiler represents a _Cilk_for statement as a #pragma omp for
+     whose body is enclosed in a #pragma omp parallel.  In this routine,
+     we handle inserting the body into the child function and putting a
+     loop around it that goes from low to high.  NOTE: Even though the
+     compiler uses this representation, the two do NOT function the same
+     way.  */
+  bool is_cilk_for =
+    (flag_enable_cilkplus && region->outer
+     && is_cilk_for_stmt (last_stmt (region->outer->entry), NULL));
+    
   if (is_combined_parallel (region))
     ws_args = region->ws_args;
   else
@@ -4698,7 +5132,6 @@ expand_omp_taskreg (struct omp_region *region)
 	  basic_block entry_succ_bb = single_succ (entry_bb);
 	  gimple_stmt_iterator gsi;
 	  tree arg, narg;
-	  gimple parcopy_stmt = NULL;
 
 	  for (gsi = gsi_start_bb (entry_succ_bb); ; gsi_next (&gsi))
 	    {
@@ -4755,6 +5188,29 @@ expand_omp_taskreg (struct omp_region *region)
 	    }
 	}
 
+      /* Extract the __high and __low parameters from the function.  */
+      tree high_arg = NULL_TREE, low_arg = NULL_TREE;
+      if (is_cilk_for)
+	{
+	  for (tree ii_arg = DECL_ARGUMENTS (child_fn); ii_arg != NULL_TREE;
+	       ii_arg = TREE_CHAIN (ii_arg))
+	    {
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)),
+			   "__high"))
+		high_arg = ii_arg;
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__low"))
+		low_arg = ii_arg;
+	    }
+	  gcc_assert (high_arg);
+	  gcc_assert (low_arg);
+	  expand_cilk_for_body (region, gimple_get_lhs (parcopy_stmt),
+				low_arg, high_arg);
+
+	  /* A new BB is added to the end of EXIT_BB and thus it needs to be
+	     updated.  */
+	  exit_bb = region->exit;
+	}
+
       /* Declare local variables needed in CHILD_CFUN.  */
       block = DECL_INITIAL (child_fn);
       BLOCK_VARS (block) = vec2chain (child_cfun->local_decls);
@@ -4787,7 +5243,7 @@ expand_omp_taskreg (struct omp_region *region)
       single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
 
       /* Convert GIMPLE_OMP_RETURN into a RETURN_EXPR.  */
-      if (exit_bb)
+      if (exit_bb && !is_cilk_for)
 	{
 	  gsi = gsi_last_bb (exit_bb);
 	  gcc_assert (!gsi_end_p (gsi)
@@ -4861,11 +5317,16 @@ expand_omp_taskreg (struct omp_region *region)
       pop_cfun ();
     }
 
-  /* Emit a library call to launch the children threads.  */
-  if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
-    expand_parallel_call (region, new_bb, entry_stmt, ws_args);
-  else
-    expand_task_call (new_bb, entry_stmt);
+  /* In _Cilk_for, the call to the runtime function is inserted by
+     expand_omp_for.  */
+  if (!is_cilk_for)
+    {
+      /* Emit a library call to launch the children threads.  */
+      if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
+	expand_parallel_call (region, new_bb, entry_stmt, ws_args);
+      else
+	expand_task_call (new_bb, entry_stmt);
+    }
   if (gimple_in_ssa_p (cfun))
     update_ssa (TODO_update_ssa_only_virtuals);
 }
@@ -6540,6 +7001,122 @@ expand_omp_for_static_chunk (struct omp_region *region,
     }
 }
 
+/* A subroutine of expand_omp_for.  Insert a call to the Cilk runtime
+   function __cilkrts_cilk_for_32 or __cilkrts_cilk_for_64 at the end of
+   REGION.  Loop information is calculated using step, n1 and n2 from FD.  */
+
+static void
+insert_cilk_for_fn_call (struct omp_region *region, struct omp_for_data *fd)
+{
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  basic_block entry_bb = region->entry;
+  bool broken_loop = region->cont == NULL;
+  basic_block cont_bb = region->cont;
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  tree diff_type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  tree grain = gimple_cilk_for_grain (fd->for_stmt);
+  
+  /* Convert n2 and n1 to the type we need.  */
+  tree n1 = fold_convert (diff_type, fd->loop.n1);
+  tree n2 = fold_convert (diff_type, fd->loop.n2);
+
+  n1 = force_gimple_operand_gsi (&gsi, n1, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  n2 = force_gimple_operand_gsi (&gsi, n2, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  tree diff_val = fold_build2 (MINUS_EXPR, diff_type, n2, n1);
+
+  diff_val = force_gimple_operand_gsi (&gsi, diff_val, true, NULL_TREE,
+					    true, GSI_SAME_STMT);
+  tree step = fd->loop.step;
+  tree step_convert = force_gimple_operand_gsi (&gsi,
+						fold_convert (diff_type, step),
+						true, NULL_TREE, true,
+						GSI_SAME_STMT);
+  tree count = fold_build2 (TRUNC_DIV_EXPR, diff_type, diff_val, step_convert);
+  count = force_gimple_operand_gsi (&gsi, count, true, NULL_TREE, true,
+				    GSI_SAME_STMT);
+
+  tree data_arg_ptr = (*region->ws_args)[0];
+  tree child_fn = (*region->ws_args)[1];
+
+  tree lib_fun = NULL_TREE;
+  if (TYPE_PRECISION (type) == 32)
+    lib_fun = cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (type) == 64)
+    lib_fun = cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+
+  vec<tree, va_gc> *args;
+  vec_alloc (args, 4);
+  args->quick_push (child_fn);
+  args->quick_push (data_arg_ptr);
+  args->quick_push (count);
+  args->quick_push (grain);
+  tree t = build_call_expr_loc_vec (UNKNOWN_LOCATION, lib_fun, args);
+  gsi_remove (&gsi, true);
+
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      gimple stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+      gsi_remove (&gsi, true);
+      
+      /* Remove the edges to the OMP continue block.  */
+      unsigned int ii = 0;
+      while (EDGE_COUNT (cont_bb->succs) > 1)
+	{
+	  edge ee = EDGE_SUCC (cont_bb, ii);
+	  if (!(ee->flags & EDGE_FALLTHRU))
+	    remove_edge (ee);
+	  ii++;
+	}      
+      gsi = gsi_start_bb (cont_bb);
+      gsi_remove (&gsi, true);
+      force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+    }
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (region->exit);
+  gimple stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_OMP_RETURN);
+  gsi_remove (&gsi, true);
+
+  gsi = gsi_last_bb (region->entry);
+  t = fold_build2 (fd->loop.cond_code, boolean_type_node, n1, n2);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  /* In here we are replacing a _Cilk_for statement with something
+     like this:
+
+     if (n1 <cond_code> n2)
+       goto bb1
+     else
+       goto bb2
+     
+     bb1:
+       .omp_data.o.__cilk_incr = __cilk_incr;
+       ...
+       __cilkrts_cilk_for_{32/64} (func_name, &omp_data_0, <count>, <grain>);
+
+     bb2:
+     clobber all values and go out.  */  
+  unsigned int ii = 0;
+  while (ii < EDGE_COUNT (region->entry->succs))
+    {
+      edge ee = EDGE_SUCC (region->entry, ii);
+      if (ee->flags & EDGE_FALLTHRU)
+	ee->flags = EDGE_TRUE_VALUE;
+      else
+	ee->flags = EDGE_FALSE_VALUE;
+      ii++;
+    }
+}
 
 /* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
    loop.  Given parameters:
@@ -6880,6 +7457,12 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 
   if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD)
     expand_omp_simd (region, &fd);
+  else if (flag_enable_cilkplus 
+	   && (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_CILKFOR))
+    {
+      region->ws_args = region->inner->ws_args;
+      insert_cilk_for_fn_call (region, &fd);
+    }
   else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
 	   && !fd.have_ordered)
     {
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
new file mode 100644
index 0000000..a80f413
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
@@ -0,0 +1,100 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+static void check (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start;  ii < end; ii = ii + incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+static void check_reverse (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start; ii >= end; ii = ii - incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+
+int main (void)
+{
+  int Array[10];
+  int x = 9, y = 0, z = 3;
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 1133;
+  check (Array, 0, 10, 1, 1133);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 9; ii > -1; ii--)
+    Array[ii] = 4433;
+  check_reverse (Array, 9, 0, 1, 4433);
+
+  _Cilk_for (int ii = 9; ii > -1; --ii)
+    Array[ii] = 9988;
+  check_reverse (Array, 9, 0, 1, 9988);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    Array[ii] = 1328;
+  check (Array, 0, 10, 2, 1328);
+
+  _Cilk_for (int ii = 9; ii >= 0; ii -= 2)
+    Array[ii] = 1738;
+  check_reverse (Array, 9, 0, 2, 1738);
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if (ii % 2)
+	Array[ii] = 1343;
+      else
+	Array[ii] = 3413;
+    }
+
+  check (Array, 1, 10, 2, 1343); 
+  check (Array, 0, 10, 2, 3413); 
+
+  _Cilk_for (short cc = 0; cc < 10; cc++) 
+    Array[cc] = 1343;
+  check (Array, 0, 10,  1,1343);
+
+  _Cilk_for (short cc = 9; cc >= 0; cc--)
+    Array[cc] = 1348;
+  check_reverse (Array, 9, 0, 1, 1348);
+
+
+
+  /* Loop with polynomials in _Cilk_for.  */
+  _Cilk_for (int ii = z - 3; ii <= z * 3; ii += 2)
+    { 
+      Array[ii] = 3233;
+    }
+
+  for (int ii = z-3; ii <= z*3; ii += 2)
+    if (Array[ii] != 3233)
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..0ebc09a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 5;
+
+  _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
+    q = 5;
+
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } */
+    q = 5;
+
+  _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..6cb9b03
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,35 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+
+  for (int ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (int ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..ff8bc0a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK, it will do a type conversion to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize = 1 
+  while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */
+    {
+    /* Blah */
+    }
+
+#pragma cilk grainsize = 1 
+  int q = 0; /* { dg-warning "grainsize pragma is not followed" } */
+  _Cilk_for (q = 0; q < 10; q++)
+    Array1[q]  = 5;
+
+  while (Array1[5] != 0)
+    {
+    /* Blah */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..7a779f7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,41 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int ii = 0;
+
+#if 1
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+#endif
+  _Cilk_for(int *aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      __builtin_abort ();
+#endif
+
+  _Cilk_for (int *aa = a; aa < a + ARRAY_SIZE; aa = aa + 2)
+    *aa = 4;
+
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii = ii + 2) 
+    if (a[ii] != 4) 
+      __builtin_abort ();
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
new file mode 100644
index 0000000..cffe17e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
@@ -0,0 +1,79 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+int main (void)
+{
+  int Array[10][10];
+
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj++)
+	{
+	  Array[ii][jj] = 0;
+	}
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 5; jj++)
+      Array[ii][jj] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 5; jj++)
+      if (Array[ii][jj] != 5)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+
+  /* One goes up and one goes down.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 9; jj >= 0; jj--)
+      Array[ii][jj] = 7;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 9; jj >= 0; jj--)
+      if (Array[ii][jj] != 7)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* Different step sizes: the inner loop steps by two.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 10; jj += 2)
+      Array[ii][jj] = 9;
+  
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj += 2)
+      if (Array[ii][jj] != 9)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* Different step sizes: the outer loop steps by two.  */
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    _Cilk_for (int jj = 5; jj < 9; jj++)
+      Array[ii][jj] = 11; 
+  
+  for (int ii = 0; ii < 10; ii += 2)
+    for (int jj = 5; jj < 9; jj++)
+      if (Array[ii][jj] != 11)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 0a41b86..988408a 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -351,6 +351,7 @@ enum omp_clause_schedule_kind {
   OMP_CLAUSE_SCHEDULE_GUIDED,
   OMP_CLAUSE_SCHEDULE_AUTO,
   OMP_CLAUSE_SCHEDULE_RUNTIME,
+  OMP_CLAUSE_SCHEDULE_CILKFOR,
   OMP_CLAUSE_SCHEDULE_LAST
 };
 
diff --git a/gcc/tree.def b/gcc/tree.def
index f8d6444..558d7c8 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1051,6 +1051,10 @@ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
    Operands like for OMP_FOR.  */
 DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6)
 
+/* Cilk Plus - _Cilk_for (..)
+   Operands like for OMP_FOR.  */
+DEFTREECODE (CILK_FOR, "cilk_for", tcc_statement, 6)
+
 /* OpenMP - #pragma omp distribute [clause1 ... clauseN]
    Operands like for OMP_FOR.  */
 DEFTREECODE (OMP_DISTRIBUTE, "omp_distribute", tcc_statement, 6)


* Re: [PATCH] _Cilk_for for C and C++
  2014-01-07 22:12                                 ` Iyer, Balaji V
@ 2014-01-08 17:31                                   ` Jakub Jelinek
  2014-01-08 19:46                                     ` Iyer, Balaji V
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Jelinek @ 2014-01-08 17:31 UTC (permalink / raw)
  To: Iyer, Balaji V
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

On Tue, Jan 07, 2014 at 10:11:59PM +0000, Iyer, Balaji V wrote:
> 	I used a similar existing one (safelen). Attached, please find 2
> fixed patches for C and C++ along with their changelogs.

But safelen is something completely different, while if I skim
the _Cilk_for docs, the grain is really a chunk size, where the runtime
library performs the scheduling of grain sized chunks, so using
OMP_CLAUSE_SCHEDULE clause with
OMP_CLAUSE_SCHEDULE_KIND (c) = OMP_CLAUSE_SCHEDULE_RUNTIME;
OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (c) = grain_expr;
sounds like what should be used.  OMP_CLAUSE_SAFELEN says what is the
minimal vectorization factor the compiler can assume is safe for
a simd loop.

	Jakub
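
To make the chunk-size reading of the grain concrete, here is a minimal serial sketch. The names run_chunked, body_fn and mark_range are hypothetical and are not part of the patch or of the Cilk runtime. A real runtime would hand the chunks to different workers; this stand-in only shows how a grain bounds the size of each [low, high) range passed to the loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loop-body callback: processes iterations [low, high).  */
typedef void (*body_fn) (void *data, unsigned low, unsigned high);

/* Serial stand-in for a chunking scheduler.  Splits [0, COUNT) into
   ranges of at most GRAIN iterations, calls BODY once per range, and
   returns the number of ranges.  A GRAIN of 0 is treated as 1 here;
   that default is an assumption of this sketch only.  */
unsigned
run_chunked (body_fn body, void *data, unsigned count, unsigned grain)
{
  unsigned chunks = 0;
  unsigned low = 0;

  assert (body != NULL);
  if (grain == 0)
    grain = 1;

  while (low < count)
    {
      /* The last chunk may be shorter than GRAIN.  */
      unsigned len = count - low < grain ? count - low : grain;
      body (data, low, low + len);
      low += len;
      chunks++;
    }
  return chunks;
}

/* Example body: counts how often each iteration is visited.  */
void
mark_range (void *data, unsigned low, unsigned high)
{
  int *hits = (int *) data;
  for (unsigned i = low; i < high; i++)
    hits[i]++;
}
```

With 10 iterations and a grain of 4, the body runs on [0, 4), [4, 8) and [8, 10). A safelen value, by contrast, would say nothing about how the iterations are partitioned.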


* RE: [PATCH] _Cilk_for for C and C++
  2014-01-08 17:31                                   ` Jakub Jelinek
@ 2014-01-08 19:46                                     ` Iyer, Balaji V
  2014-01-16 17:29                                       ` Jason Merrill
  2014-01-16 21:19                                       ` Aldy Hernandez
  0 siblings, 2 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-08 19:46 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

[-- Attachment #1: Type: text/plain, Size: 1175 bytes --]



> -----Original Message-----
> From: Jakub Jelinek [mailto:jakub@redhat.com]
> Sent: Wednesday, January 8, 2014 12:31 PM
> To: Iyer, Balaji V
> Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'rth@redhat.com'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On Tue, Jan 07, 2014 at 10:11:59PM +0000, Iyer, Balaji V wrote:
> > 	I used a similar existing one (safelen). Attached, please find 2
> > fixed patches for C and C++ along with their changelogs.
> 
> But safelen is something completely different, while if I skim the _Cilk_for
> docs, the grain is really a chunk size, where the runtime library performs the
> scheduling of grain sized chunks, so using OMP_CLAUSE_SCHEDULE clause
> with OMP_CLAUSE_SCHEDULE_KIND (c) =
> OMP_CLAUSE_SCHEDULE_RUNTIME;
> OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (c) = grain_expr; sounds like what
> should be used.  OMP_CLAUSE_SAFELEN says what is the minimal
> vectorization factor the compiler can assume is safe for a simd loop.
> 

OK, fixed as you requested. Attached are the fixed patches and their respective changelogs.

Is this Ok for trunk?

-Balaji V. Iyer.

> 	Jakub

[-- Attachment #2: c-ChangeLog --]
[-- Type: application/octet-stream, Size: 4350 bytes --]

gcc/ChangeLog
2014-01-08  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* cilk-builtins.def: Added two new builtin functions called
	__cilkrts_cilk_for_32 and __cilkrts_cilk_for_64.
	* cilk-common.c (cilk_init_builtins): Likewise.
	(cilk_declare_looper): New function.
	* cilk.h (enum cilk_tree_index): Added two new fields called
	CILK_TI_F_LOOP_32 and CILK_TI_F_LOOP_64.
	(cilk_for_32_fndecl): New #define.
	(cilk_for_64_fndecl): Likewise.
	* gimple-pretty-print.c (dump_gimple_omp_for): Added a new case for
	GF_OMP_FOR_KIND_CILKFOR.  Also emitted "_Cilk_for (" instead of "for ("
	when the gimple kind is GF_OMP_FOR_KIND_CILKFOR.
	* gimple.h (enum gf_mask): Added a new field GF_OMP_FOR_KIND_CILKFOR.
	Re-arranged a couple of other fields to keep them in ascending order.
	(struct gimple_omp_for_iter): Added a new field called "grain."
	(gimple_cilk_for_set_grain): New function.
	(gimple_cilk_for_induction_var): Likewise.
	(gimple_cilk_for_grain): Likewise.
	* gimplify.c (gimplify_omp_for): Added code to handle gimplification
	of a _Cilk_for statement.
	* omp-low.c (struct cilk_for_information): New structure.
	(create_omp_child_function_name): Added a new bool parameter called
	is_cilk_for.  If this is set, then use a different suffix.
	(extract_omp_for_data): Added a check for _Cilk_for's kind for a
	NE_EXPR case.  Added the correct schedule type for _Cilk_for.
	(use_pointer_for_field): Reject the use of pointers for the induction
	variable of the outer function.
	(find_cilk_for_stmt): New function.
	(is_cilk_for_stmt): Likewise.
	(cilk_for_check_loop_diff_type): Likewise.
	(expand_cilk_for_body): Likewise.
	(is_cilk_loop_var): Likewise.
	(cilk_find_field_value): Likewise.
	(cilk_find_component_expr): Likewise.
	(find_cilk_for_vars): Likewise.
	(insert_cilk_for_fn_call): Likewise.
	(create_omp_child_function): Added two new parameters to pass in
	whether it is a _Cilk_for body and the induction variable type.  If
	it is _Cilk_for, then create two new parameters and a different
	function type.
	(lower_rec_input_clauses): Set the new decl expr value to the
	variable for the "__cilk_init," "__cilk_cond" and "__cilk_incr"
	variables.
	(scan_omp_parallel): Added a check if the outer statement is a
	_Cilk_for and if so, then find the correct induction variable type to
	pass it into create_omp_child_function.
	(expand_omp_taskreg): Added code to extract the high and low parameters
	from the child function and then insert it in the appropriate location.
	Added a call to expand_cilk_for_body.  Allowed the insertion of the
	library calls when the taskreg being expanded is not a _Cilk_for.
	(expand_omp_for): Added a check for GF_OMP_FOR_KIND_CILKFOR for the
	for statement's kind.  If so then call insert_cilk_for_fn_call.
	* tree-core.h (enum omp_clause_schedule_kind): Added a new field
	OMP_CLAUSE_SCHEDULE_CILK_FOR.
	* tree.def (CILK_FOR): New tree.
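
The entries above mention the "__cilk_init", "__cilk_cond" and "__cilk_incr" temporaries and the low and high parameters of the child function. As a rough illustration only (this is an assumption about the arithmetic involved, not the patch's actual lowering), the trip count of an upward loop and the mapping from a normalized iteration number back to the induction variable could be sketched as below. The function names are hypothetical.

```c
#include <assert.h>

/* Number of iterations of: for (i = init; i < limit; i += incr).
   INCR must be positive.  Returns 0 when the body never runs.  */
unsigned long
loop_count_up (long init, long limit, long incr)
{
  assert (incr > 0);
  if (init >= limit)
    return 0;

  /* Distance and step as unsigned values, then a ceiling division
     written so that no intermediate sum can overflow.  */
  unsigned long dist = (unsigned long) limit - (unsigned long) init;
  unsigned long step = (unsigned long) incr;
  return dist / step + (dist % step != 0);
}

/* Value of the induction variable in normalized iteration K, where
   K lies in a [low, high) range of the kind the child function
   receives.  */
long
induction_value (long init, long incr, unsigned long k)
{
  return init + (long) k * incr;
}
```

For the loops in the nested test earlier in the thread, this gives 5 iterations for (jj = 0; jj < 10; jj += 2) and 4 iterations for (jj = 5; jj < 9; jj++).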

gcc/c-family/ChangeLog
2014-01-08  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-omp.c (c_finish_omp_for): Added a check for CILK_FOR along with
	CILK_SIMD.
	* c-common.h (enum rid): Added new value called "RID_CILK_FOR."
	* c-common.c (c_common_reswords[]): Added a new field "_Cilk_for."
	* c-pragma.c (init_pragma): Added cilk grainsize pragma.
	* c-pragma.h (enum pragma_kind): Added PRAGMA_CILK_GRAINSIZE.

gcc/c/ChangeLog
2014-01-08  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-parser.c (c_parser_statement_after_labels): Added a RID_CILK_FOR
	case.
	(c_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(c_parser_omp_for_loop): Renamed the "clauses" parameter to
	"clauses_or_grain."  Added handling for _Cilk_for statements.  Set
	the grain value to the clauses location.
	(c_parser_cilk_grainsize): New function.
	(c_parser_cilk_simd): Added a new parameter called grain.  Also added
	support to parse _Cilk_for statements.

gcc/testsuite/ChangeLog
2014-01-08  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk-fors.c: New testcase.
	* c-c++-common/cilk-plus/CK/nested_cilk_for.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.
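
For readers who have not seen the grainsize pragma, the following sketch shows the shape of code that the grain tests exercise, using the "#pragma cilk grainsize = <EXP>" syntax documented in the C parser patch. PARALLEL_FOR and fill_squares are hypothetical names, and keying on the __cilk macro to detect Cilk Plus support is an assumption of this sketch. Without that macro the code falls back to an ordinary serial loop, so it builds with any C compiler. The induction variable is declared outside the loop header here; per the cover letter, whether it may be declared in the initializer differs between the C and C++ front ends.

```c
#include <assert.h>

/* Use _Cilk_for only when the compiler advertises Cilk Plus support;
   otherwise fall back to a plain for loop.  */
#ifdef __cilk
# define PARALLEL_FOR _Cilk_for
#else
# define PARALLEL_FOR for
#endif

#define N 16

/* Fills OUT[0..N-1] with squares and returns their sum.  */
int
fill_squares (int out[N])
{
  int i, sum = 0;

  /* Ask for chunks of four iterations.  The pragma must be followed
     directly by the _Cilk_for statement it applies to.  */
#ifdef __cilk
#pragma cilk grainsize = 4
#endif
  PARALLEL_FOR (i = 0; i < N; i++)
    out[i] = i * i;

  /* The reduction is kept serial so the sketch has no data race.  */
  for (i = 0; i < N; i++)
    sum += out[i];
  return sum;
}
```

Each iteration writes a distinct element of out, so the result does not depend on the grain or on the order in which chunks run.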

[-- Attachment #3: cp-ChangeLog --]
[-- Type: application/octet-stream, Size: 1532 bytes --]

gcc/cp/ChangeLog
2014-01-08  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* parser.c (cp_parser_statement): Added a RID_CILK_FOR case.
	(cp_parser_omp_for_cond): Included a check for CILK_FOR along with
	CILK_SIMD.
	(cp_parser_omp_for_loop): Overall, added support to parse _Cilk_for
	statement along with omp for statements.
	(cp_parser_cilk_grainsize): New function.
	(cp_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(cp_parser_cilk_simd): Added a new parameter for grain.  Added support
	to handle _Cilk_for loops along with #pragma simd for loops.
	* pt.c (tsubst_expr): Added CILK_FOR case.  If the tree is CILK_FOR
	then just RECUR its clauses, instead of calling tsubst_omp_clauses.
	* semantics.c (handle_omp_for_class_iterator): Added 2 new parameters.
	Added a NE_EXPR case.  Added a check for _Cilk_for statement and
	if so, then give a name for the new induction variable.
	(finish_omp_for): Added a check if the code is _Cilk_for and if true
	then insert all the iterator temporary variables into the _Cilk_for
	body.

gcc/testsuite/ChangeLog
2014-01-08  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Made certain error
	tags C specific and inserted their C++ equivalents.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* g++.dg/cilk-plus/CK/cilk-for-tplt.cc: New testcase.
	* g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_rev_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_test.cc: Likewise.


[-- Attachment #4: diff_c.txt --]
[-- Type: text/plain, Size: 64632 bytes --]

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 40d12bc..9d24691
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -414,6 +414,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7e3ece6..0eaebf3 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index ac380ee..b15cd4c 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -386,7 +386,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
   bool fail = false;
   int i;
 
-  if (code == CILK_SIMD
+  if ((code == CILK_SIMD || code == CILK_FOR) 
       && !c_check_cilk_loop (locus, TREE_VEC_ELT (declv, 0)))
     fail = true;
 
@@ -516,7 +516,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 					    0))
 		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
 					 ? LT_EXPR : GE_EXPR);
-		  else if (code != CILK_SIMD)
+		  else if (code != CILK_SIMD && code != CILK_FOR)
 		    cond_ok = false;
 		}
 	    }
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index af28085..6f22148 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1394,6 +1394,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 6f1bf74..b9f09ba 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index f73df08..7416389 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1248,10 +1248,11 @@ static bool c_parser_objc_diagnose_bad_element_prefix
   (c_parser *, struct c_declspecs *);
 
 /* Cilk Plus supporting routines.  */
-static void c_parser_cilk_simd (c_parser *);
+static void c_parser_cilk_simd (c_parser *, tree);
 static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context);
 static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
 static tree c_parser_cilk_clause_vectorlength (c_parser *, tree, bool);
+static void c_parser_cilk_grainsize (c_parser *);
 
 /* Parse a translation unit (C90 6.7, C99 6.9).
 
@@ -4878,6 +4879,16 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (c_parser_peek_token (parser)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      c_parser_skip_to_end_of_block_or_statement (parser);
+	    }
+	  else
+	    c_parser_cilk_simd (parser, integer_zero_node);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9488,7 +9499,25 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       if (!c_parser_cilk_verify_simd (parser, context))
 	return false;
       c_parser_consume_pragma (parser);
-      c_parser_cilk_simd (parser);
+      c_parser_cilk_simd (parser, NULL_TREE);
+      return false;
+
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not"
+		   " enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
       return false;
 
     default:
@@ -11583,7 +11612,7 @@ c_parser_omp_flush (c_parser *parser)
 
 static tree
 c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
-		       tree clauses, tree *cclauses)
+		       tree clauses_or_grain, tree *cclauses)
 {
   tree decl, cond, incr, save_break, save_cont, body, init, stmt, cl;
   tree declv, condv, incrv, initv, ret = NULL;
@@ -11591,6 +11620,9 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   int i, collapse = 1, nbraces = 0;
   location_t for_loc;
   vec<tree, va_gc> *for_block = make_tree_vector ();
+  tree clauses = code == CILK_FOR ? NULL_TREE : clauses_or_grain;
+  tree grain = code == CILK_FOR ? clauses_or_grain : NULL_TREE;
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
 
   for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl))
     if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE)
@@ -11603,11 +11635,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   condv = make_tree_vec (collapse);
   incrv = make_tree_vec (collapse);
 
-  if (!c_parser_next_token_is_keyword (parser, RID_FOR))
+  if (code != CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_FOR))
     {
       c_parser_error (parser, "for statement expected");
       return NULL;
     }
+  if (code == CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_CILK_FOR))
+    {
+      c_parser_error (parser, "_Cilk_for statement expected");
+      return NULL;
+    }
   for_loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
 
@@ -11685,7 +11724,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	    case LE_EXPR:
 	      break;
 	    case NE_EXPR:
-	      if (code == CILK_SIMD)
+	      if (code == CILK_SIMD || code == CILK_FOR)
 		break;
 	      /* FALLTHRU.  */
 	    default:
@@ -11767,6 +11806,12 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
     c_break_label = size_one_node;
   save_cont = c_cont_label;
   c_cont_label = NULL_TREE;
+
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = c_begin_omp_parallel ();
+    }
   body = push_stmt_list ();
 
   if (open_brace_parsed)
@@ -11814,6 +11859,13 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	}
     }
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = c_finish_omp_parallel (loc, NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
+
   /* Only bother calling c_finish_omp_for if we haven't already generated
      an error from the initialization parsing.  */
   if (!fail)
@@ -11859,6 +11911,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 		  }
 	    }
 	  OMP_FOR_CLAUSES (stmt) = clauses;
+	  /* If it is a _Cilk_for statement, then the OMP_FOR_CLAUSES location
+	     stores the user-defined grain value or an integer_zero_node 
+	     indicating that the runtime must compute a suitable grain.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree l = build_omp_clause (EXPR_LOCATION (grain),
+					 OMP_CLAUSE_SCHEDULE);
+	      OMP_CLAUSE_SCHEDULE_KIND (l) = OMP_CLAUSE_SCHEDULE_RUNTIME;
+	      OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (l) = grain;
+	      OMP_CLAUSE_CHAIN (l) = NULL_TREE;
+	      OMP_FOR_CLAUSES (stmt) = l;
+	    }
 	}
       ret = stmt;
     }
@@ -13762,16 +13826,65 @@ c_parser_cilk_all_clauses (c_parser *parser)
   return c_finish_cilk_clauses (clauses);
 }
 
+/* This function helps parse the grainsize pragma for a _Cilk_for statement. 
+   Here is the correct syntax of this pragma: 
+	    #pragma cilk grainsize = <EXP>  */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+
+  /* consume the 'grainsize' keyword.  */
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>") != 0)
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR)
+	{
+	  error_at (input_location, "cannot convert grain to long integer.\n");
+	  c_parser_skip_to_pragma_eol (parser);
+	}   
+      else if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token && token->type == CPP_KEYWORD
+	      && token->keyword == RID_CILK_FOR)
+	    {
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain && grain != error_mark_node) 
+		c_parser_cilk_simd (parser, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
    loops.  */
 
 static void
-c_parser_cilk_simd (c_parser *parser)
+c_parser_cilk_simd (c_parser *parser, tree grain)
 {
-  tree clauses = c_parser_cilk_all_clauses (parser);
+  bool is_cilk_for = grain == NULL_TREE ? false : true;
+  tree clauses = NULL_TREE;
+  
+  if (!is_cilk_for)
+    clauses = c_parser_cilk_all_clauses (parser);
+  else 
+    clauses = grain;
   tree block = c_begin_compound_stmt (true);
   location_t loc = c_parser_peek_token (parser)->location;
-  c_parser_omp_for_loop (loc, parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  c_parser_omp_for_loop (loc, parser, code, clauses, NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 }
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
index 9f3240a..bf319d5 100644
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
index afe88c9..bc1092b 100644
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -106,6 +106,26 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME.  */
+
+static tree
+cilk_declare_looper (const char *name, tree type, enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -269,6 +289,15 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_32",
+                                            unsigned_intSI_type_node,
+                                            BUILT_IN_CILK_FOR_32);
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_64",
+                                            unsigned_intDI_type_node,
+                                            BUILT_IN_CILK_FOR_64);
+
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
index d2ae931..0e98998 100644
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,9 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
+
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -77,6 +80,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 2d1e1c7..1e7bebf 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1158,6 +1158,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_DISTRIBUTE:
 	  pp_string (buffer, "#pragma omp distribute");
 	  break;
+	case GF_OMP_FOR_KIND_CILKFOR:
+	  gcc_assert (flag_enable_cilkplus);
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -1167,7 +1170,11 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	  if (i)
 	    spc += 2;
 	  newline_and_indent (buffer, spc);
-	  pp_string (buffer, "for (");
+	  if (flag_enable_cilkplus 
+	      && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
+	    pp_string (buffer, "_Cilk_for (");
+	  else
+	    pp_string (buffer, "for (");
 	  dump_generic_node (buffer, gimple_omp_for_index (gs, i), spc,
 			     flags, false);
 	  pp_string (buffer, " = ");
@@ -1192,6 +1199,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	    case GE_EXPR:
 	      pp_greater_equal (buffer);
 	      break;
+	    case NE_EXPR:
+	      pp_string (buffer, "!=");
+	      break;
 	    default:
 	      gcc_unreachable ();
 	    }
diff --git a/gcc/gimple.h b/gcc/gimple.h
index df92863..42304fd 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -91,13 +91,14 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
-    GF_OMP_FOR_KIND_MASK	= 3 << 0,
+    GF_OMP_FOR_KIND_MASK	= 7 << 0,
     GF_OMP_FOR_KIND_FOR		= 0 << 0,
     GF_OMP_FOR_KIND_DISTRIBUTE	= 1 << 0,
     GF_OMP_FOR_KIND_SIMD	= 2 << 0,
     GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
-    GF_OMP_FOR_COMBINED		= 1 << 2,
-    GF_OMP_FOR_COMBINED_INTO	= 1 << 3,
+    GF_OMP_FOR_KIND_CILKFOR     = 4 << 0,
+    GF_OMP_FOR_COMBINED		= 1 << 3,
+    GF_OMP_FOR_COMBINED_INTO	= 1 << 4,
     GF_OMP_TARGET_KIND_MASK	= 3 << 0,
     GF_OMP_TARGET_KIND_REGION	= 0 << 0,
     GF_OMP_TARGET_KIND_DATA	= 1 << 0,
@@ -523,6 +524,9 @@ struct GTY(()) gimple_omp_for_iter {
 
   /* Increment.  */
   tree incr;
+
+  /* Grain value, only used by _Cilk_for.  */
+  tree grain;
 };
 
 /* GIMPLE_OMP_FOR */
@@ -4562,6 +4566,37 @@ gimple_omp_for_set_pre_body (gimple gs, gimple_seq pre_body)
   omp_for_stmt->pre_body = pre_body;
 }
 
+/* Set GRAIN to be the grain value used by Cilk runtime for OMP_FOR GS.  */
+
+static inline void
+gimple_cilk_for_set_grain (tree grain, gimple gs)
+{
+  const gimple_statement_omp_for *omp_for_stmt =
+    as_a <gimple_statement_omp_for> (gs);
+  omp_for_stmt->iter[0].grain = grain;
+}
+
+/* Returns the induction variable of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_induction_var (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->index;
+}
+
+/* Returns the GRAIN value of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_grain (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->grain;
+}
 
 /* Return the clauses associated with OMP_PARALLEL GS.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index a6e0c75..d3685e0 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6559,7 +6559,18 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
   bool simd;
   bitmap has_decl_expr = NULL;
 
+  tree grain = NULL_TREE;
+  tree orig_init = NULL_TREE, orig_cond = NULL_TREE, orig_incr = NULL_TREE;
   orig_for_stmt = for_stmt = *expr_p;
+  
+  if (TREE_CODE (for_stmt) == CILK_FOR) 
+    { 
+      /* The user cannot pass any clauses for _Cilk_for, thus the grain value
+	 in a schedule clause.  */
+      grain = OMP_FOR_CLAUSES (for_stmt);
+      grain = OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (grain);
+      OMP_FOR_CLAUSES (for_stmt) = NULL_TREE;
+    }
 
   simd = TREE_CODE (for_stmt) == OMP_SIMD
     || TREE_CODE (for_stmt) == CILK_SIMD;
@@ -6603,6 +6614,11 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     }
 
   for_body = NULL;
+  if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
+    {
+      tree it = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), 0);
+      gimplify_and_add (it, &for_pre_body);
+    }
   gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
 	      == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
   gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
@@ -6677,7 +6693,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	}
       else
 	var = decl;
-
+ 
+      /* Original initial, final and increment values are necessary to compute
+	 the loop-count.  Otherwise, they are stored in variables and their
+	 context could be changed, potentially making it impossible to compute
+	 them correctly.  */
+      orig_init = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6689,10 +6710,18 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gcc_assert (COMPARISON_CLASS_P (t));
       gcc_assert (TREE_OPERAND (t, 0) == decl);
 
-      tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-			    is_gimple_val, fb_rvalue);
-      ret = MIN (ret, tret);
-
+      if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
+	{
+	  int x = 1;
+	  orig_cond = TREE_OPERAND (t, 1);
+	  copy_tree_r (&orig_cond, &x, NULL);
+	}
+      else
+	{
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL, 
+				is_gimple_val, fb_rvalue);
+	  ret = MIN (ret, tret);
+	}
       /* Handle OMP_FOR_INCR.  */
       t = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
       switch (TREE_CODE (t))
@@ -6713,6 +6742,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	    t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	    t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	    TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	    orig_incr = build_one_cst (TREE_TYPE (t));
 	    break;
 	  }
 
@@ -6726,6 +6756,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  orig_incr = build_one_cst (TREE_TYPE (t));
 	  break;
 
 	case MODIFY_EXPR:
@@ -6753,8 +6784,16 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	      gcc_unreachable ();
 	    }
 
-	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-				is_gimple_val, fb_rvalue);
+	  orig_incr = TREE_OPERAND (t, 1);
+	  /* Right here we are just trying to extract the absolute
+	     value of the increment.  */
+	  if (TREE_CODE (t) == MINUS_EXPR
+	      || TREE_CODE  (TREE_OPERAND (t, 1)) == NEGATE_EXPR
+	      || (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST
+		  && tree_int_cst_sgn (TREE_OPERAND (t, 1)) < 1))
+	    orig_incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), orig_incr);
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body,
+				NULL, is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
 	  if (c)
 	    {
@@ -6802,8 +6841,57 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 
   BITMAP_FREE (has_decl_expr);
 
+  tree incr_val = NULL_TREE, init_val = NULL_TREE, cond_val = NULL_TREE;
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      tree stmt_list = alloc_stmt_list ();
+      incr_val = create_tmp_var (TREE_TYPE (orig_incr), "__cilk_incr");
+      tree mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_incr), incr_val,
+			 orig_incr);
+      append_to_statement_list (mod, &stmt_list);
+
+      init_val = create_tmp_var (TREE_TYPE (orig_init), "__cilk_init");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_init), init_val, orig_init);
+      append_to_statement_list (mod, &stmt_list);
+
+      cond_val = create_tmp_var (TREE_TYPE (orig_cond), "__cilk_cond");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_cond), cond_val, orig_cond);
+      append_to_statement_list (mod, &stmt_list);
+  
+      gimplify_and_add (stmt_list, &for_pre_body);
+    }
   gimplify_and_add (OMP_FOR_BODY (orig_for_stmt), &for_body);
+ 
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      /* Sometimes an assign is inserted before the OMP_FOR_BODY.  So,
+	 search and find the omp for body.  */
+      gimple for_body_stmt = NULL;
+      for (gimple_stmt_iterator gsi = gsi_start (for_body); !gsi_end_p (gsi);
+	   gsi_next (&gsi))
+	{
+	  for_body_stmt = gsi_stmt (gsi);
+	  if (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL)
+	    break;
+	}
+      gcc_assert (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL);
+      tree orig_clses = gimple_omp_parallel_clauses (for_body_stmt);
+      tree new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = init_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = cond_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
 
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = incr_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      gimple_omp_parallel_set_clauses (for_body_stmt, new_clause);
+    }
   if (orig_for_stmt != for_stmt)
     for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
       {
@@ -6825,6 +6913,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
     case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
     case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break;
+    case CILK_FOR: kind = GF_OMP_FOR_KIND_CILKFOR; break;
     case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break;
     default:
       gcc_unreachable ();
@@ -6859,6 +6948,9 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gimple_omp_for_set_incr (gfor, i, TREE_OPERAND (t, 1));
     }
 
+  if (kind == GF_OMP_FOR_KIND_CILKFOR) 
+    gimple_cilk_for_set_grain (grain, gfor);
+
   gimplify_seq_add_stmt (pre_p, gfor);
   if (ret != GS_ALL_DONE)
     return GS_ERROR;
@@ -7880,6 +7972,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OMP_FOR:
 	case OMP_SIMD:
 	case CILK_SIMD:
+	case CILK_FOR:
 	case OMP_DISTRIBUTE:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index f1ec1c6..0beaa2a 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "tree-nested.h"
 #include "tree-eh.h"
+#include "cilk.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -198,6 +199,12 @@ struct omp_for_data
   struct omp_for_data_loop *loops;
 };
 
+/* A structure holding the necessary elements of a _Cilk_for statement.  It
+   is passed to the statement walker through WALK_STMT_INFO->INFO.  */
+typedef struct cilk_for_information {
+  bool found;
+  tree induction_var;
+} cilk_for_info;
 
 static splay_tree all_contexts;
 static int taskreg_nesting_level;
@@ -314,6 +321,9 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   fd->have_ordered = false;
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
   fd->chunk_size = NULL_TREE;
+  if (flag_enable_cilkplus 
+      && gimple_omp_for_kind (fd->for_stmt) == GF_OMP_FOR_KIND_CILKFOR)
+    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
   collapse_iter = NULL;
   collapse_count = NULL;
 
@@ -391,8 +401,10 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	case GT_EXPR:
 	  break;
 	case NE_EXPR:
-	  gcc_assert (gimple_omp_for_kind (for_stmt)
-		      == GF_OMP_FOR_KIND_CILKSIMD);
+	  gcc_assert ((gimple_omp_for_kind (for_stmt)
+		       == GF_OMP_FOR_KIND_CILKSIMD)
+		      || (gimple_omp_for_kind (for_stmt)
+			  == GF_OMP_FOR_KIND_CILKFOR));
 	  break;
 	case LE_EXPR:
 	  if (POINTER_TYPE_P (TREE_TYPE (loop->n2)))
@@ -897,7 +909,31 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx)
 	 variable no longer really shared.  */
       if (shared_ctx->is_nested)
 	{
-	  omp_context *up;
+	  omp_context *up = shared_ctx->outer;
+
+	  /* If VAR is the induction variable of the outer _Cilk_for, then
+	     it needs to be passed as a value not pointer since it
+	     would not be overwritten by the body.  */
+	  if (flag_enable_cilkplus
+	      && gimple_code (up->stmt) == GIMPLE_OMP_FOR
+	      && gimple_omp_for_kind (up->stmt) == GF_OMP_FOR_KIND_CILKFOR) 
+	    while (up) 
+	      { 
+		if (gimple_code (up->stmt) == GIMPLE_OMP_FOR
+		    && gimple_omp_for_kind (up->stmt)
+		    == GF_OMP_FOR_KIND_CILKFOR)
+		  {
+		    struct omp_for_data fd;
+		    /* _Cilk_for always has collapse = 1.  */
+		    struct omp_for_data_loop *loops
+		      = (struct omp_for_data_loop *)
+		      alloca (sizeof (struct omp_for_data_loop));
+		    extract_omp_for_data (up->stmt, &fd, loops);
+		    if (DECL_NAME (decl) == DECL_NAME (fd.loop.v))
+		      return false;
+		  }
+		up = up->outer;
+	      }
 
 	  for (up = shared_ctx->outer; up; up = up->outer)
 	    if (is_taskreg_ctx (up) && maybe_lookup_decl (decl, up))
@@ -1818,27 +1854,112 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	scan_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 }
 
-/* Create a new name for omp child function.  Returns an identifier.  */
+/* Create a new name for omp child function.  Returns an identifier.  If 
+   IS_CILK_FOR is true then the suffix for the child function is 
+   "_cilk_for_fn."  */
 
 static tree
-create_omp_child_function_name (bool task_copy)
+create_omp_child_function_name (bool task_copy, bool is_cilk_for)
 {
+  if (is_cilk_for)
+    return clone_function_name (current_function_decl, "_cilk_for_fn");
   return (clone_function_name (current_function_decl,
 			       task_copy ? "_omp_cpyfn" : "_omp_fn"));
 }
 
+/* Helper function for walk_gimple_seq.  *GSI_P is the gimple statement
+   iterator passed by walk_gimple_seq and WI->INFO holds the CILK_FOR_INFO
+   structure.  This function sets the values inside this structure if it
+   finds a _Cilk_for statement in *GSI_P.  HANDLED_OPS_P is unused.  */
+
+static tree
+find_cilk_for_stmt (gimple_stmt_iterator *gsi_p,
+		    bool *handled_ops_p ATTRIBUTE_UNUSED,
+		    struct walk_stmt_info *wi)
+{
+  cilk_for_info *cf_info = (cilk_for_info *) wi->info;
+  gimple stmt = gsi_stmt (*gsi_p);
+
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+      && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_CILKFOR)
+      /* For nested _Cilk_for statements, just look into the
+	 outer-most one.  */
+      && cf_info->found == false)
+    {
+      cf_info->found = true;
+      cf_info->induction_var = gimple_cilk_for_induction_var (stmt);
+    }
+  return NULL_TREE;
+}
+
+/* Returns true if STMT contains a CILK_FOR statement.  If found, then
+   set *IND_VAR to the induction variable; otherwise it remains
+   untouched.  IND_VAR can be NULL, in which case it is ignored.  */
+
+static bool
+is_cilk_for_stmt (gimple stmt, tree *ind_var)
+{
+  if (!flag_enable_cilkplus)
+    return false;
+    
+  gimple_seq body = stmt;
+  struct walk_stmt_info wi;
+  cilk_for_info cf_info;
+  memset (&cf_info, 0, sizeof (cilk_for_info));
+  memset (&wi, 0, sizeof (wi));
+  wi.info = &cf_info;
+  walk_gimple_seq (body, find_cilk_for_stmt, NULL, &wi);
+  if (cf_info.found)
+    {
+      if (ind_var)
+	*ind_var = cf_info.induction_var;
+      return true;
+    }
+    
+  return false;
+}
+
+/* Returns the type of the induction variable for the child function for
+   _Cilk_for and the types for _high and _low variables based on TYPE.  */
+
+static tree
+cilk_for_check_loop_diff_type (tree type)
+{
+  if (type == integer_type_node)
+    return type;
+  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
+    { 
+      if (TYPE_UNSIGNED (type)) 
+	return uint32_type_node;
+      else
+	return integer_type_node;
+    }
+  else
+    {
+      if (TYPE_UNSIGNED (type)) 
+	return uint64_type_node;
+      else
+	return long_long_integer_type_node;
+    }
+  gcc_unreachable ();
+}
+
 /* Build a decl for the omp child function.  It'll not contain a body
    yet, just the bare decl.  */
 
 static void
-create_omp_child_function (omp_context *ctx, bool task_copy)
+create_omp_child_function (omp_context *ctx, bool task_copy,
+			   bool is_cilk_for, tree cilk_var_type)
 {
   tree decl, type, name, t;
-
-  name = create_omp_child_function_name (task_copy);
+ 
+  name = create_omp_child_function_name (task_copy, is_cilk_for);
   if (task_copy)
     type = build_function_type_list (void_type_node, ptr_type_node,
 				     ptr_type_node, NULL_TREE);
+  else if (is_cilk_for)
+    type = build_function_type_list (void_type_node, ptr_type_node,
+				     cilk_var_type, cilk_var_type, NULL_TREE);
   else
     type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
 
@@ -1888,6 +2009,33 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_CONTEXT (t) = decl;
   DECL_RESULT (decl) = t;
 
+  /* _Cilk_for's child function requires two extra parameters called 
+     __low and __high that are set by the Cilk runtime when it calls this
+     function.  */
+  if (is_cilk_for)
+    {
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__high"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__low"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+    }
   t = build_decl (DECL_SOURCE_LOCATION (decl),
 		  PARM_DECL, get_identifier (".omp_data_i"), ptr_type_node);
   DECL_ARTIFICIAL (t) = 1;
@@ -1895,6 +2043,8 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_ARG_TYPE (t) = ptr_type_node;
   DECL_CONTEXT (t) = current_function_decl;
   TREE_USED (t) = 1;
+  if (is_cilk_for)
+    DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
   DECL_ARGUMENTS (decl) = t;
   if (!task_copy)
     ctx->receiver_decl = t;
@@ -2016,7 +2166,15 @@ scan_omp_parallel (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+
+  tree ind_var = NULL_TREE;
+  bool is_cilk_for = (flag_enable_cilkplus && outer_ctx
+		      && is_cilk_for_stmt (outer_ctx->stmt, &ind_var));
+  tree cilk_var_type =
+    (is_cilk_for ? cilk_for_check_loop_diff_type (TREE_TYPE (ind_var))
+     : NULL_TREE);
+
+  create_omp_child_function (ctx, false, is_cilk_for, cilk_var_type);
   gimple_omp_parallel_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_parallel_clauses (stmt), ctx);
@@ -2061,7 +2219,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+  create_omp_child_function (ctx, false, false, NULL_TREE);
   gimple_omp_task_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_task_clauses (stmt), ctx);
@@ -2074,7 +2232,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
       DECL_ARTIFICIAL (name) = 1;
       DECL_NAMELESS (name) = 1;
       TYPE_NAME (ctx->srecord_type) = name;
-      create_omp_child_function (ctx, true);
+      create_omp_child_function (ctx, true, false, NULL_TREE);
     }
 
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
@@ -2199,7 +2357,7 @@ scan_omp_target (gimple stmt, omp_context *outer_ctx)
   TYPE_NAME (ctx->record_type) = name;
   if (kind == GF_OMP_TARGET_KIND_REGION)
     {
-      create_omp_child_function (ctx, false);
+      create_omp_child_function (ctx, false, false, NULL_TREE);
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
@@ -2993,6 +3151,15 @@ lower_rec_simd_input_clauses (tree new_var, omp_context *ctx, int &max_vf,
   return true;
 }
 
+/* Returns true if the variable name in DECL matches *NAME.  */
+
+static inline bool
+is_cilk_loop_var (tree decl, const char *name)
+{
+  return (DECL_NAME (decl) && !strncmp (IDENTIFIER_POINTER (DECL_NAME (decl)), 
+					name, strlen (name))); 
+}
+
 /* Generate code to implement the input clauses, FIRSTPRIVATE and COPYIN,
    from the receiver (aka child) side and initializers for REFERENCE_TYPE
    private variables.  Initialization statements go in ILIST, while calls
@@ -3245,6 +3412,18 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 	      SET_DECL_VALUE_EXPR (new_var, x);
 	      DECL_HAS_VALUE_EXPR_P (new_var) = 1;
 
+	      /* In _Cilk_for, the increment, start and final values
+		 are stored in the clauses inserted by gimplify_omp_for.
+		 These values are used by the child function to compute
+		 the appropriate induction value based on the
+		 high and low parameters of the child function.
+		 Now, we need to store the decl value expressions here so
+		 that we can easily access them.  */
+	      if (flag_enable_cilkplus 
+		  && (is_cilk_loop_var (var, "__cilk_init") 
+		      || is_cilk_loop_var (var, "__cilk_cond")
+		      || is_cilk_loop_var (var, "__cilk_incr"))) 
+		SET_DECL_VALUE_EXPR (var, x);
 	      /* ??? If VAR is not passed by reference, and the variable
 		 hasn't been initialized yet, then we'll get a warning for
 		 the store into the omp_data_s structure.  Ideally, we'd be
@@ -4628,6 +4807,250 @@ expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from)
     }
 }
 
+/* Returns true if T is a tree whose code is COMPONENT_REF and its field
+   matches D_F_NAME and the data argument matches D_ARG_NAME.  */
+
+static bool
+cilk_find_field_value (tree t, tree d_arg_name, tree d_f_name)
+{
+  if (TREE_CODE (t) == COMPONENT_REF)
+    {
+      tree arg = TREE_OPERAND (t, 0);
+      tree field = TREE_OPERAND (t, 1);
+      if (TREE_CODE (arg) == ADDR_EXPR || TREE_CODE (arg) == MEM_REF)
+	arg = TREE_OPERAND (arg, 0);
+      if (DECL_NAME (arg) && DECL_NAME (field)
+	  && !strcmp (IDENTIFIER_POINTER (d_arg_name),
+		      IDENTIFIER_POINTER (DECL_NAME (arg)))
+	  && !strcmp (IDENTIFIER_POINTER (d_f_name),
+		      IDENTIFIER_POINTER (DECL_NAME (field)))) 
+	return true;
+    }
+  return false;
+}
+
+/* Find the COMPONENT_REF in all the basic blocks in REGION whose 
+   data-argument is DATA_ARG and field is FIELD and then replace that 
+   COMPONENT_REF value with NEW_VALUE, a VAR_DECL.  */
+
+static void
+cilk_for_find_component_expr (struct omp_region *region, tree data_arg,
+			      tree field, tree new_value)
+{
+  vec<basic_block> bbs;
+  basic_block bb;
+  unsigned ii;
+  tree new_val = NULL_TREE;
+  bbs.create (0);
+  gather_blocks_in_sese_region (region->entry, region->exit, &bbs);
+  /* No need to push the entry bb into BBS since it doesn't get inserted
+     into the child function.  */
+  
+  tree da_name = DECL_NAME (data_arg);
+  tree df_name = DECL_NAME (field);
+  FOR_EACH_VEC_ELT (bbs, ii, bb)    
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+	 gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+	if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	  for (unsigned jj = 1; jj < gimple_num_ops (stmt); jj++)
+	    {
+	      tree *op = gimple_op_ptr (stmt, jj);
+	      if (TREE_CODE (*op) == COMPONENT_REF
+		  && cilk_find_field_value (*op, da_name, df_name))
+		{    
+		  if (TREE_TYPE (*op) == TREE_TYPE (new_value))
+		    new_val = new_value;
+		  else
+		    {
+		      tree t = fold_convert (TREE_TYPE (*op), new_value);
+		      new_val =
+			force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+						  true, GSI_NEW_STMT);
+		    }
+		  gsi_insert_before (&gsi, gimple_build_assign (*op, new_val), 
+				     GSI_NEW_STMT);
+		  *op = new_val;
+		}
+	    }
+      }
+}
+
+/* Find the initial, final and increment values in BODY_STMT's clause
+   and store them in *INIT, *FINAL and *INCR parameters respectively.  */
+
+static void
+find_cilk_for_vars (gimple body_stmt, tree *init, tree *final, tree *incr)
+{
+  /* The names of the initial, final and increment values start with
+     __cilk_init, __cilk_cond and __cilk_incr, respectively.  These values
+     are defined in the shared clauses.  Thus, we search for those.  */
+  for (tree cc = gimple_omp_parallel_clauses (body_stmt); cc; 
+       cc = OMP_CLAUSE_CHAIN (cc))
+    if (OMP_CLAUSE_CODE (cc) == OMP_CLAUSE_SHARED)
+      {
+	tree decl = OMP_CLAUSE_DECL (cc);
+	if (is_cilk_loop_var (decl, "__cilk_incr"))
+	  { 
+	    *incr = decl;
+	    if (DECL_VALUE_EXPR (*incr))
+	      *incr = DECL_VALUE_EXPR (*incr);
+	  } 
+	else if (is_cilk_loop_var (decl, "__cilk_init"))
+	  { 
+	    *init = decl;
+	    if (DECL_VALUE_EXPR (*init))
+	      *init = DECL_VALUE_EXPR (*init);
+	  }
+	else if (is_cilk_loop_var (decl, "__cilk_cond"))
+	  { 
+	    *final = decl;
+	    if (DECL_VALUE_EXPR (*final))
+	      *final = DECL_VALUE_EXPR (*final);
+	  }
+      }
+}
+ 
+/* Expand the _Cilk_for body starting at REGION.  DATA_ARG, LOW and HIGH
+   indicate the data argument, __low and __high parameters of the child
+   function.  */
+
+static void
+expand_cilk_for_body (struct omp_region *region, tree data_arg,
+		      tree low, tree high)
+{
+  struct omp_for_data fd;
+  struct omp_for_data_loop *loops;
+  loops
+    = (struct omp_for_data_loop *)
+      alloca (gimple_omp_for_collapse (last_stmt (region->outer->entry))
+	      * sizeof (struct omp_for_data_loop));
+  extract_omp_for_data (last_stmt (region->outer->entry), &fd, loops);
+  region->sched_kind = fd.sched_kind;
+  basic_block entry_bb = region->entry;
+  
+  /* This is where the body is and the location where we must insert
+     the modification to the induction variable.  */
+  basic_block body_bb = single_succ (region->entry);
+  gimple entry_stmt = last_stmt (region->entry);
+  
+  /* Split the first basic block into two and put the initializer values
+     in the top one.  */
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  basic_block l1_bb = split_block (entry_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (l1_bb);
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd.loop.v));
+  tree ind_var = create_tmp_reg (type, "__cilk_ind_var");
+  tree t = fold_convert (type, low);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_NEW_STMT);
+  gimple stmt = gimple_build_assign (ind_var, fold_convert (type, t));
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  vec_alloc (region->ws_args, 2);
+  tree t1 = null_pointer_node;
+  tree t2 = build_fold_addr_expr (gimple_omp_parallel_child_fn (entry_stmt));
+  if (data_arg)
+    {
+      t1 = build_fold_addr_expr (gimple_omp_parallel_data_arg (entry_stmt));
+      gsi = gsi_start_bb (body_bb);
+      tree init = NULL_TREE, final_val = NULL_TREE, incr = NULL_TREE;
+      find_cilk_for_vars (entry_stmt, &init, &final_val, &incr);
+
+      tree step = fd.loop.step;
+      if (TREE_CODE (fd.loop.step) != INTEGER_CST)
+	step = incr;      
+      step = fold_convert (type, step);
+      if (TREE_CODE (step) == INTEGER_CST && tree_int_cst_sgn (step) < 1)
+	step = fold_build1_loc (UNKNOWN_LOCATION, NEGATE_EXPR, type, step);
+      
+      tree tmp = create_tmp_reg (type, NULL);
+      gsi_insert_before (&gsi, gimple_build_assign (tmp, step),
+			 GSI_NEW_STMT);
+      t = build2 (MULT_EXPR, type, ind_var, tmp);
+      tree tmp2 = create_tmp_reg (type, NULL);
+      gsi_insert_after (&gsi, gimple_build_assign (tmp2, t), GSI_NEW_STMT);
+
+      tmp = create_tmp_reg (type, NULL);
+      init = fold_convert (type, init);
+      tree init_tmp = force_gimple_operand_gsi
+	(&gsi, init, true, NULL_TREE, false, GSI_CONTINUE_LINKING); 
+
+      gsi_insert_after (&gsi, gimple_build_assign (tmp, init_tmp), 
+			GSI_NEW_STMT);
+      if (fd.loop.cond_code == GE_EXPR || fd.loop.cond_code == GT_EXPR) 
+	t = fold_build2 (MINUS_EXPR, type, tmp, tmp2);
+      else 
+	t = fold_build2 (PLUS_EXPR, type, tmp, tmp2);
+
+      t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false, 
+				    GSI_CONTINUE_LINKING);
+      tree tmp3 = create_tmp_reg (type, NULL);
+      gimple stmt = gimple_build_assign (tmp3, t);
+      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+      cilk_for_find_component_expr (region, data_arg, fd.loop.v, tmp3);
+    }
+  region->ws_args->quick_push (t1);
+  region->ws_args->quick_push (t2);
+  
+  gsi = gsi_last_bb (l1_bb);
+  basic_block cond_bb = split_block (l1_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (l1_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (cond_bb);
+  t = fold_convert (type, high);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_CONTINUE_LINKING);
+  t = build2 (LT_EXPR, boolean_type_node, ind_var, t);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p,
+		 NULL, NULL)
+      || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p,
+		    NULL, NULL))
+    {
+      gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+
+  /* Insert incrementing of induction variable.  */
+  gsi = gsi_last_bb (body_bb);
+  t = build2 (PLUS_EXPR, type, ind_var, build_one_cst (type));
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+  gsi_insert_after (&gsi, gimple_build_assign (ind_var, t),
+		    GSI_CONTINUE_LINKING);
+  
+  basic_block exit_bb = region->exit;
+
+  gsi = gsi_last_bb (exit_bb);
+  basic_block last_bb = split_block (exit_bb, gsi_stmt (gsi))->dest;
+  
+  /* Remove the #pragma omp return.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+  
+  gsi = gsi_last_bb (last_bb);
+  gsi_insert_before (&gsi, gimple_build_return (NULL), GSI_SAME_STMT);
+  
+  /* Now connect all the basic-blocks.  */
+  edge e = make_edge (cond_bb, last_bb, EDGE_FALSE_VALUE);
+  e->probability = REG_BR_PROB_BASE / 4;
+
+  edge e3 = find_edge (cond_bb, body_bb);
+  e3->probability = REG_BR_PROB_BASE * 3 / 4;
+  e3->flags = EDGE_TRUE_VALUE;
+  
+  edge e2 = find_edge (exit_bb, last_bb);
+  remove_edge (e2);
+  e2 = make_edge (exit_bb, cond_bb, EDGE_FALLTHRU);
+  e2->probability = 1;
+  region->exit = last_bb;
+}
+
 /* Expand the OpenMP parallel or task directive starting at REGION.  */
 
 static void
@@ -4640,6 +5063,7 @@ expand_omp_taskreg (struct omp_region *region)
   gimple entry_stmt, stmt;
   edge e;
   vec<tree, va_gc> *ws_args;
+  gimple parcopy_stmt = NULL;
 
   entry_stmt = last_stmt (region->entry);
   child_fn = gimple_omp_taskreg_child_fn (entry_stmt);
@@ -4648,6 +5072,16 @@ expand_omp_taskreg (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  /* The way _Cilk_for is constructed in the compiler is like making
+     the _Cilk_for statement a #pragma omp for and the body of it is
+     enclosed in #pragma omp parallel.  In this routine, we handle
+     inserting the body into the child function and putting a loop around
+     it to go from low to high.  NOTE: Even though this is how the 
+     compiler breaks them, they do NOT function the same way.  */
+  bool is_cilk_for =
+    (flag_enable_cilkplus && region->outer
+     && is_cilk_for_stmt (last_stmt (region->outer->entry), NULL));
+    
   if (is_combined_parallel (region))
     ws_args = region->ws_args;
   else
@@ -4698,7 +5132,6 @@ expand_omp_taskreg (struct omp_region *region)
 	  basic_block entry_succ_bb = single_succ (entry_bb);
 	  gimple_stmt_iterator gsi;
 	  tree arg, narg;
-	  gimple parcopy_stmt = NULL;
 
 	  for (gsi = gsi_start_bb (entry_succ_bb); ; gsi_next (&gsi))
 	    {
@@ -4755,6 +5188,29 @@ expand_omp_taskreg (struct omp_region *region)
 	    }
 	}
 
+      /* Extract the __high and __low parameter from the function.  */
+      tree high_arg = NULL_TREE, low_arg = NULL_TREE;
+      if (is_cilk_for)
+	{
+	  for (tree ii_arg = DECL_ARGUMENTS (child_fn); ii_arg != NULL_TREE;
+	       ii_arg = TREE_CHAIN (ii_arg))
+	    {
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)),
+			   "__high"))
+		high_arg = ii_arg;
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__low"))
+		low_arg = ii_arg;
+	    }
+	  gcc_assert (high_arg);
+	  gcc_assert (low_arg);
+	  expand_cilk_for_body (region, gimple_get_lhs (parcopy_stmt),
+				low_arg, high_arg);
+
+	  /* A new BB is added to the end of EXIT_BB and thus it needs to be
+	     updated.  */
+	  exit_bb = region->exit;
+	}
+
       /* Declare local variables needed in CHILD_CFUN.  */
       block = DECL_INITIAL (child_fn);
       BLOCK_VARS (block) = vec2chain (child_cfun->local_decls);
@@ -4787,7 +5243,7 @@ expand_omp_taskreg (struct omp_region *region)
       single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
 
       /* Convert GIMPLE_OMP_RETURN into a RETURN_EXPR.  */
-      if (exit_bb)
+      if (exit_bb && !is_cilk_for)
 	{
 	  gsi = gsi_last_bb (exit_bb);
 	  gcc_assert (!gsi_end_p (gsi)
@@ -4861,11 +5317,16 @@ expand_omp_taskreg (struct omp_region *region)
       pop_cfun ();
     }
 
-  /* Emit a library call to launch the children threads.  */
-  if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
-    expand_parallel_call (region, new_bb, entry_stmt, ws_args);
-  else
-    expand_task_call (new_bb, entry_stmt);
+  /* In _Cilk_for, the call to the runtime function is inserted by
+     expand_omp_for.  */
+  if (!is_cilk_for)
+    {
+      /* Emit a library call to launch the children threads.  */
+      if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
+	expand_parallel_call (region, new_bb, entry_stmt, ws_args);
+      else
+	expand_task_call (new_bb, entry_stmt);
+    }
   if (gimple_in_ssa_p (cfun))
     update_ssa (TODO_update_ssa_only_virtuals);
 }
@@ -6540,6 +7001,122 @@ expand_omp_for_static_chunk (struct omp_region *region,
     }
 }
 
+/* A subroutine of expand_omp_for.  Insert the call to the Cilk runtime
+   library function __cilkrts_cilk_for_64/32 at the end of
+   REGION.  Loop information is calculated using step, n1 and n2 from FD.  */
+
+static void
+insert_cilk_for_fn_call (struct omp_region *region, struct omp_for_data *fd)
+{
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  basic_block entry_bb = region->entry;
+  bool broken_loop = region->cont == NULL;
+  basic_block cont_bb = region->cont;
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  tree diff_type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  tree grain = gimple_cilk_for_grain (fd->for_stmt);
+  
+  /* Convert n2 and n1 to the type we need.  */
+  tree n1 = fold_convert (diff_type, fd->loop.n1);
+  tree n2 = fold_convert (diff_type, fd->loop.n2);
+
+  n1 = force_gimple_operand_gsi (&gsi, n1, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  n2 = force_gimple_operand_gsi (&gsi, n2, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  tree diff_val = fold_build2 (MINUS_EXPR, diff_type, n2, n1);
+
+  diff_val = force_gimple_operand_gsi (&gsi, diff_val, true, NULL_TREE,
+					    true, GSI_SAME_STMT);
+  tree step = fd->loop.step;
+  tree step_convert = force_gimple_operand_gsi (&gsi,
+						fold_convert (diff_type, step),
+						true, NULL_TREE, true,
+						GSI_SAME_STMT);
+  tree count = fold_build2 (TRUNC_DIV_EXPR, diff_type, diff_val, step_convert);
+  count = force_gimple_operand_gsi (&gsi, count, true, NULL_TREE, true,
+				    GSI_SAME_STMT);
+
+  tree data_arg_ptr = (*region->ws_args)[0];
+  tree child_fn = (*region->ws_args)[1];
+
+  tree lib_fun = NULL_TREE;
+  if (TYPE_PRECISION (type) == 32)
+    lib_fun = cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (type) == 64)
+    lib_fun = cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+
+  vec<tree, va_gc> *args;
+  vec_alloc (args, 4);
+  args->quick_push (child_fn);
+  args->quick_push (data_arg_ptr);
+  args->quick_push (count);
+  args->quick_push (grain);
+  tree t = build_call_expr_loc_vec (UNKNOWN_LOCATION, lib_fun, args);
+  gsi_remove (&gsi, true);
+
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      gimple stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+      gsi_remove (&gsi, true);
+      
+      /* Remove the edge to the OMP continue block.  */
+      unsigned int ii = 0;
+      while (EDGE_COUNT (cont_bb->succs) > 1)
+	{
+	  edge ee = EDGE_SUCC (cont_bb, ii);
+	  if (!(ee->flags & EDGE_FALLTHRU))
+	    remove_edge (ee);
+	  ii++;
+	}      
+      gsi = gsi_start_bb (cont_bb);
+      gsi_remove (&gsi, true);
+      force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+    }
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (region->exit);
+  gimple stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_OMP_RETURN);
+  gsi_remove (&gsi, true);
+
+  gsi = gsi_last_bb (region->entry);
+  t = fold_build2 (fd->loop.cond_code, boolean_type_node, n1, n2);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  /* In here we are replacing a _Cilk_for statement with something
+     like this:
+
+     if (n1 <cond_code> n2)
+       goto bb1
+     else
+       goto bb2
+     
+     bb1:
+       .omp_data.o.__cilk_incr = __cilk_incr;
+       ...
+       __cilkrts_cilk_for_{32/64} (func_name, &omp_data_0, <count>, <grain>);
+
+     bb2:
+     clobber all values and go out.  */  
+  unsigned int ii = 0;
+  while (ii < EDGE_COUNT (region->entry->succs))
+    {
+      edge ee = EDGE_SUCC (region->entry, ii);
+      if (ee->flags & EDGE_FALLTHRU)
+	ee->flags = EDGE_TRUE_VALUE;
+      else
+	ee->flags = EDGE_FALSE_VALUE;
+      ii++;
+    }
+}
 
 /* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
    loop.  Given parameters:
@@ -6880,6 +7457,12 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 
   if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD)
     expand_omp_simd (region, &fd);
+  else if (flag_enable_cilkplus 
+	   && (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_CILKFOR))
+    {
+      region->ws_args = region->inner->ws_args;
+      insert_cilk_for_fn_call (region, &fd);
+    }
   else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
 	   && !fd.have_ordered)
     {
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
new file mode 100644
index 0000000..a80f413
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
@@ -0,0 +1,100 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+static void check (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start;  ii < end; ii = ii + incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+static void check_reverse (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start; ii >= end; ii = ii - incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+
+int main (void)
+{
+  int Array[10];
+  int x = 9, y = 0, z = 3;
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 1133;
+  check (Array, 0, 10, 1, 1133);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 9; ii > -1; ii--)
+    Array[ii] = 4433;
+  check_reverse (Array, 9, 0, 1, 4433);
+
+  _Cilk_for (int ii = 9; ii > -1; --ii)
+    Array[ii] = 9988;
+  check_reverse (Array, 9, 0, 1, 9988);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    Array[ii] = 1328;
+  check (Array, 0, 10, 2, 1328);
+
+  _Cilk_for (int ii = 9; ii >= 0; ii -= 2)
+    Array[ii] = 1738;
+  check_reverse (Array, 9, 0, 2, 1738);
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if (ii % 2)
+	Array[ii] = 1343;
+      else
+	Array[ii] = 3413;
+    }
+
+  check (Array, 1, 10, 2, 1343); 
+  check (Array, 0, 10, 2, 3413); 
+
+  _Cilk_for (short cc = 0; cc < 10; cc++) 
+    Array[cc] = 1343;
+  check (Array, 0, 10,  1,1343);
+
+  _Cilk_for (short cc = 9; cc >= 0; cc--)
+    Array[cc] = 1348;
+  check_reverse (Array, 9, 0, 1, 1348);
+
+
+
+  /* Loop with polynomials in _Cilk_for.  */
+  _Cilk_for (int ii = z - 3; ii <= z * 3; ii += 2)
+    { 
+      Array[ii] = 3233;
+    }
+
+  for (int ii = z-3; ii <= z*3; ii += 2)
+    if (Array[ii] != 3233)
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..0ebc09a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 5;
+
+  _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
+    q = 5;
+
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } */
+    q = 5;
+
+  _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+    q = 5;
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..6cb9b03
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,35 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+
+  for (int ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (int ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..ff8bc0a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK, it will do a type conversion to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize = 1 
+  while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */
+    {
+    /* Blah */
+    }
+
+#pragma cilk grainsize = 1 
+  int q = 0; /* { dg-warning "grainsize pragma is not followed" } */
+  _Cilk_for (q = 0; q < 10; q++)
+    Array1[q]  = 5;
+
+  while (Array1[5] != 0)
+    {
+    /* Blah */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..7a779f7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,41 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int ii = 0;
+
+#if 1
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+#endif
+  _Cilk_for(int *aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      __builtin_abort ();
+#endif
+
+  _Cilk_for (int *aa = a; aa < a + ARRAY_SIZE; aa = aa + 2)
+    *aa = 4;
+
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii = ii + 2) 
+    if (a[ii] != 4) 
+      __builtin_abort ();
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
new file mode 100644
index 0000000..cffe17e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
@@ -0,0 +1,79 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+int main (void)
+{
+  int Array[10][10];
+
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj++)
+	{
+	  Array[ii][jj] = 0;
+	}
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 5; jj++)
+      Array[ii][jj] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 5; jj++)
+      if (Array[ii][jj] != 5)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+
+  /* One goes up and one goes down.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 9; jj >= 0; jj--)
+      Array[ii][jj] = 7;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 9; jj >= 0; jj--)
+      if (Array[ii][jj] != 7)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 10; jj += 2)
+      Array[ii][jj] = 9;
+  
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj += 2)
+      if (Array[ii][jj] != 9)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    _Cilk_for (int jj = 5; jj < 9; jj++)
+      Array[ii][jj] = 11; 
+  
+  for (int ii = 0; ii < 10; ii += 2)
+    for (int jj = 5; jj < 9; jj++)
+      if (Array[ii][jj] != 11)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 0a41b86..988408a 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -351,6 +351,7 @@ enum omp_clause_schedule_kind {
   OMP_CLAUSE_SCHEDULE_GUIDED,
   OMP_CLAUSE_SCHEDULE_AUTO,
   OMP_CLAUSE_SCHEDULE_RUNTIME,
+  OMP_CLAUSE_SCHEDULE_CILKFOR,
   OMP_CLAUSE_SCHEDULE_LAST
 };
 
diff --git a/gcc/tree.def b/gcc/tree.def
index f8d6444..558d7c8 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1051,6 +1051,10 @@ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
    Operands like for OMP_FOR.  */
 DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6)
 
+/* Cilk Plus - _Cilk_for (..)
+   Operands like for OMP_FOR.  */
+DEFTREECODE (CILK_FOR, "cilk_for", tcc_statement, 6)
+
 /* OpenMP - #pragma omp distribute [clause1 ... clauseN]
    Operands like for OMP_FOR.  */
 DEFTREECODE (OMP_DISTRIBUTE, "omp_distribute", tcc_statement, 6)

[-- Attachment #5: diff_c++.txt --]
[-- Type: text/plain, Size: 21514 bytes --]

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c99c1fc..91e32e3 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -237,8 +237,8 @@ static void cp_parser_initial_pragma
 static tree cp_literal_operator_id
   (const char *);
 
-static void cp_parser_cilk_simd
-  (cp_parser *, cp_token *);
+static tree cp_parser_cilk_simd
+  (cp_parser *, cp_token *, tree);
 static bool cp_parser_omp_declare_reduction_exprs
   (tree, cp_parser *);
 
@@ -9364,6 +9364,18 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
 
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (cp_lexer_peek_token (parser->lexer)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      cp_lexer_consume_token (parser->lexer);
+	      statement = error_mark_node;
+	    }
+	  else
+	    statement = cp_parser_cilk_simd (parser, NULL, integer_zero_node);
+	  break;
+
 	case RID_BREAK:
 	case RID_CONTINUE:
 	case RID_RETURN:
@@ -28694,7 +28706,7 @@ cp_parser_omp_for_cond (cp_parser *parser, tree decl, enum tree_code code)
     case LE_EXPR:
       break;
     case NE_EXPR:
-      if (code == CILK_SIMD)
+      if (code == CILK_SIMD || code == CILK_FOR)
 	break;
       /* Fall through: OpenMP disallows NE_EXPR.  */
     default:
@@ -29019,11 +29031,18 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       bool add_private_clause = false;
       location_t loc;
 
-      if (!cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
+      if (code == CILK_SIMD
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
 	{
 	  cp_parser_error (parser, "for statement expected");
 	  return NULL;
 	}
+      if (code == CILK_FOR
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_CILK_FOR))
+	{
+	  cp_parser_error (parser, "_Cilk_for statement expected");
+	  return NULL;
+	}
       loc = cp_lexer_consume_token (parser->lexer)->location;
 
       if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
@@ -29032,13 +29051,26 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       init = decl = real_decl = NULL;
       this_pre_body = push_stmt_list ();
 
+      if (code == CILK_FOR
+	  && cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC))
+	{
+	  error_at (cp_lexer_peek_token (parser->lexer)->location,
+		    "induction variable cannot be static");
+	  cp_lexer_consume_token (parser->lexer);
+	}
       add_private_clause
 	|= cp_parser_omp_for_loop_init (parser,
-					/*parsing_openmp=*/code != CILK_SIMD,
+					/*parsing_openmp=*/
+					(code != CILK_SIMD && code != CILK_FOR),
 					this_pre_body, for_block,
 					init, decl, real_decl);
 
-      cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
+      if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON)
+	  && code == CILK_FOR)
+	{
+	  cp_parser_skip_to_end_of_statement (parser);
+	  cp_parser_consume_semicolon_at_end_of_statement (parser);
+	}
       if (this_pre_body)
 	{
 	  this_pre_body = pop_stmt_list (this_pre_body);
@@ -29196,17 +29228,30 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
 
   /* Note that we saved the original contents of this flag when we entered
      the structured block, and so we don't need to re-save it here.  */
-  if (code == CILK_SIMD)
+  if (code == CILK_SIMD || code == CILK_FOR)
     parser->in_statement = IN_CILK_SIMD_FOR;
   else
     parser->in_statement = IN_OMP_FOR;
 
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = begin_omp_parallel ();
+    }
+
   /* Note that the grammar doesn't call for a structured block here,
      though the loop as a whole is a structured block.  */
   body = push_stmt_list ();
   cp_parser_statement (parser, NULL_TREE, false, NULL);
   body = pop_stmt_list (body);
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = finish_omp_parallel (NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
   if (declv == NULL_TREE)
     ret = NULL_TREE;
   else
@@ -31084,6 +31129,38 @@ cp_parser_initial_pragma (cp_token *first_token)
   cp_lexer_get_preprocessor_token (NULL, first_token);
 }
 
+/* Parses the grainsize pragma for the _Cilk_for statement.
+   Syntax:
+   #pragma cilk grainsize = <VALUE>.  */
+
+static void
+cp_parser_cilk_grainsize (cp_parser *parser, cp_token *pragma_tok)
+{
+  if (cp_parser_require (parser, CPP_EQ, RT_EQ))
+    {
+      tree exp = cp_parser_binary_expression (parser, false, false,
+                                              PREC_NOT_OPERATOR, NULL);
+      cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+      if (!exp || exp == error_mark_node)
+        {
+          error_at (pragma_tok->location, "invalid grainsize for _Cilk_for");
+          return;
+        }
+      cp_token *n_tok = cp_lexer_peek_token (parser->lexer);
+
+      /* Make sure the next token is _Cilk_for; anything else is invalid.  */
+      if (n_tok && n_tok->type == CPP_KEYWORD
+	  && n_tok->keyword == RID_CILK_FOR)
+	cp_parser_cilk_simd (parser, NULL, exp);
+      else
+	warning_at (cp_lexer_peek_token (parser->lexer)->location, 0,
+		    "%<#pragma cilk grainsize%> is not followed by "
+		    "%<_Cilk_for%>");
+      return;
+    }
+  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+}
+
 /* Normal parsing of a pragma token.  Here we can (and must) use the
    regular lexer.  */
 
@@ -31263,9 +31340,30 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
 		    "%<#pragma simd%> must be inside a function");
 	  break;
 	}
-      cp_parser_cilk_simd (parser, pragma_tok);
+      cp_parser_cilk_simd (parser, pragma_tok, NULL_TREE);
       return true;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (context == pragma_external)
+        {
+          error_at (pragma_tok->location,
+                    "%<#pragma cilk grainsize%> must be inside a function");
+          break;
+        }
+
+      /* Reject the pragma with an error if Cilk Plus is not enabled.  */
+      if (flag_enable_cilkplus)
+        {
+          cp_parser_cilk_grainsize (parser, pragma_tok);
+          return true;
+        }
+      else
+        {
+          error_at (pragma_tok->location, "-fcilkplus must be enabled to use "
+                    "%<#pragma cilk grainsize%>");
+          break;
+	}
+      
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -31555,31 +31653,64 @@ cp_parser_cilk_simd_all_clauses (cp_parser *parser, cp_token *pragma_token)
     return c_finish_cilk_clauses (clauses);
 }
 
-/* Main entry-point for parsing Cilk Plus <#pragma simd> for loops.  */
+/* Main entry-point for parsing Cilk Plus <#pragma simd> for and _Cilk_for
+   loops.  This function returns NULL_TREE whenever it is parsing the
+   #pragma simd's for because the caller does not check the return value.
+   _Cilk_for's caller checks this value, so the function returns
+   error_mark_node when errors happen and a valid value otherwise.  */
 
-static void
-cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token)
+static tree
+cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token, tree grain)
 {
-  tree clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
+  bool is_cilk_for = !pragma_token;
+  tree clauses = NULL_TREE;
+
+  if (!is_cilk_for)
+    clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
 
   if (clauses == error_mark_node)
-    return;
-  
-  if (cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
+    return NULL_TREE;
+
+  if (!is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
     {
       error_at (cp_lexer_peek_token (parser->lexer)->location,
 		"for statement expected");
-      return;
+      return NULL_TREE;
+    }
+  if (is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_CILK_FOR))
+    {
+      error_at (cp_lexer_peek_token (parser->lexer)->location,
+		"_Cilk_for statement expected");
+      return error_mark_node;
     }
 
   tree sb = begin_omp_structured_block ();
   int save = cp_parser_begin_omp_structured_block (parser);
-  tree ret = cp_parser_omp_for_loop (parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  tree ret = cp_parser_omp_for_loop (parser, code, clauses, NULL);
   if (ret)
     cpp_validate_cilk_plus_loop (OMP_FOR_BODY (ret));
+
+  /* For _Cilk_for statements, the grain value is stored in the same
+     location as clauses for OMP for.  */
+  if (is_cilk_for && ret)
+    { 
+      tree l = build_omp_clause (EXPR_LOCATION (grain),
+				 OMP_CLAUSE_SCHEDULE);
+      OMP_CLAUSE_SCHEDULE_KIND (l) = OMP_CLAUSE_SCHEDULE_RUNTIME;
+      OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (l) = grain;
+      OMP_CLAUSE_CHAIN (l) = NULL_TREE;
+      OMP_FOR_CLAUSES (ret) = l;
+    }
+
   cp_parser_end_omp_structured_block (parser, save);
-  add_stmt (finish_omp_structured_block (sb));
-  return;
+  tree stmt = finish_omp_structured_block (sb);
+  add_stmt (stmt);
+  if (is_cilk_for) 
+    return stmt;
+  return NULL_TREE;
 }
 
 /* Create an identifier for a generic parameter type (a synthesized
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 98d7365..8efdb03 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13575,6 +13575,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
     case OMP_FOR:
     case OMP_SIMD:
     case CILK_SIMD:
+    case CILK_FOR:
     case OMP_DISTRIBUTE:
       {
 	tree clauses, body, pre_body;
@@ -13582,8 +13583,23 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
 	tree incrv = NULL_TREE;
 	int i;
 
-	clauses = tsubst_omp_clauses (OMP_FOR_CLAUSES (t), false,
-				      args, complain, in_decl);
+	/* We cannot use the tsubst_omp_clauses since it will try to
+	   do checking such as whether a certain clause can be used
+	   with a certain for-loop.  We just use the schedule clause here
+	   as a holder for the grain value.  */
+	if (TREE_CODE (t) == CILK_FOR)
+	  {
+	    tree grain = OMP_FOR_CLAUSES (t);
+	    grain = RECUR (OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (grain));
+	    clauses = build_omp_clause (EXPR_LOCATION (grain), 
+					OMP_CLAUSE_SCHEDULE); 
+	    OMP_CLAUSE_SCHEDULE_KIND (clauses) = OMP_CLAUSE_SCHEDULE_RUNTIME; 
+	    OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (clauses) = grain;
+	    OMP_CLAUSE_CHAIN (clauses) = NULL_TREE;
+	  } 
+	else
+	  clauses = tsubst_omp_clauses (OMP_FOR_CLAUSES (t), false,
+					args, complain, in_decl);
 	if (OMP_FOR_INIT (t) != NULL_TREE)
 	  {
 	    declv = make_tree_vec (TREE_VEC_LENGTH (OMP_FOR_INIT (t)));
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 0bb64c7..cc1a013 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5965,7 +5965,8 @@ finish_omp_task (tree clauses, tree body)
 static bool
 handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
 			       tree condv, tree incrv, tree *body,
-			       tree *pre_body, tree clauses)
+			       tree *pre_body, tree clauses,
+			       bool is_cilk_for)
 {
   tree diff, iter_init, iter_incr = NULL, last;
   tree incr_var = NULL, orig_pre_body, orig_body, c;
@@ -5985,6 +5986,7 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
     case GE_EXPR:
     case LT_EXPR:
     case LE_EXPR:
+    case NE_EXPR:
       if (TREE_OPERAND (cond, 1) == iter)
 	cond = build2 (swap_tree_comparison (TREE_CODE (cond)),
 		       TREE_TYPE (cond), iter, TREE_OPERAND (cond, 0));
@@ -6128,6 +6130,11 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
       break;
 
   decl = create_temporary_var (TREE_TYPE (diff));
+  /* In _Cilk_for we must know the induction variable name since it is
+     read by expand_cilk_for_body in omp-low.c to set the induction
+     variable in the child function correctly.  */
+  if (is_cilk_for)
+    DECL_NAME (decl) = make_anon_name ();
   pushdecl (decl);
   add_decl_expr (decl);
   last = create_temporary_var (TREE_TYPE (diff));
@@ -6343,8 +6350,24 @@ finish_omp_for (location_t locus, enum tree_code code, tree declv, tree initv,
 				"iteration variable %qE", decl);
 	      return NULL;
 	    }
-	  if (handle_omp_for_class_iterator (i, locus, declv, initv, condv,
-					     incrv, &body, &pre_body, clauses))
+
+	  /* In _Cilk_for, all the iterator mapping code should be
+	     inserted in the OMP_PARALLEL_BODY.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree the_body = OMP_PARALLEL_BODY (body);
+	      if (TREE_CODE (the_body) == BIND_EXPR)
+		the_body = BIND_EXPR_BODY (the_body);
+	      if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						 condv, incrv, &the_body,
+						 &pre_body, clauses, true))
+		return NULL;
+	      else
+		BIND_EXPR_BODY (OMP_PARALLEL_BODY (body)) = the_body;
+	    }
+	  else if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						  condv, incrv, &body,
+						  &pre_body, clauses, false))
 	    return NULL;
 	  continue;
 	}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
index 0ebc09a..ed73c34 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -7,7 +7,8 @@ int main (void)
 {
   int q = 0, ii = 0, jj = 0;
 
-  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */
+    /* { dg-error "expected" "" { target c++ } 10 } */
     q = 5;
 
   _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
@@ -16,24 +17,30 @@ int main (void)
   _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
     q = 2;
 
-  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" } */
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" "" { target c } } */
+    /* { dg-error "invalid controlling predicate" "" { target c++ }  20 } */
     q = 5;
 
   _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
     q = 5;
 
-  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected" } */ 
     q = 5;
 
   _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
     q = 5;
 
-  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } */
+ 
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static" } */
+
     q = 5;
 
+
   _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
     q = 5;
 
+
   _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
     q = 5;
 
@@ -43,7 +50,9 @@ int main (void)
   _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
     q = 5;
 
-  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */ 
+    /* { dg-error "expected" "" { target c++ }  53 } */
     q = 5;
+
   return 0;
 }
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
index ff8bc0a..e1e3217 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -29,13 +29,13 @@ int main(int argc, char **argv)
     Array1[ii] = 0;
 
 #pragma cilk grainsize = 1 
-  while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */
+  while (Array1[5] != 0) /* { dg-warning "is not followed by" } */
     {
     /* Blah */
     }
 
 #pragma cilk grainsize = 1 
-  int q = 0; /* { dg-warning "grainsize pragma is not followed" } */
+  int q = 0; /* { dg-warning "is not followed by" } */
   _Cilk_for (q = 0; q < 10; q++)
     Array1[q]  = 5;
 
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
new file mode 100644
index 0000000..8221371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
@@ -0,0 +1,25 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define SIZE 100
+#define CHECK_VALUE 5
+
+template <class T>
+int func (T start, T end)
+{
+  int Array[SIZE];
+  _Cilk_for (T ii = 0; ii < end; ii++)
+    Array[ii] = CHECK_VALUE;
+  
+  for (T ii = 0; ii < end; ii++)
+    if (Array[ii] != CHECK_VALUE)
+      __builtin_abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  return func <int> (0, 100) + func <long> (0, 100);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
new file mode 100644
index 0000000..2ac8c72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
@@ -0,0 +1,52 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array;
+vector <int> array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back (ii);
+}
+#endif
+_Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end();
+          iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+for (vector<int>::iterator iter = array_serial.begin(); 
+     iter != array_serial.end(); iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
new file mode 100644
index 0000000..1cf3301
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
@@ -0,0 +1,72 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array,array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back(ii);
+}
+#endif
+_Cilk_for (vector<int>::reverse_iterator iter4 = array.rbegin(); 
+	   iter4 != array.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+
+for (vector<int>::reverse_iterator iter4 = array_serial.rbegin();
+     iter4 != array_serial.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+_Cilk_for (vector<int>::reverse_iterator iter2 = array.rbegin(); 
+	   iter2 != array.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+for (vector<int>::reverse_iterator iter2 = array_serial.rbegin(); 
+     iter2 != array_serial.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
+
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
new file mode 100644
index 0000000..3e350a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
@@ -0,0 +1,50 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <iostream>
+#include <cstdio>
+#include <cstdlib>
+#include <vector>
+#include <algorithm>
+#include <list>
+
+using namespace std;
+
+
+int main(int argc, char **argv)
+{
+  vector <int> number_list, number_list_serial;
+  int new_number = 0;
+  int no_elements = 0;
+  
+  if (argc != 2)
+  {
+    no_elements = 10000;
+  }
+
+
+  number_list.clear();
+  number_list_serial.clear();
+  for (int ii = 0; ii < no_elements; ii++)
+  {
+    number_list.push_back(new_number);
+    number_list_serial.push_back(new_number);
+  }
+
+  _Cilk_for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list[jj] = jj + no_elements;
+  }
+  for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list_serial[jj] = jj + no_elements;
+  }
+
+  for (int jj = 0; jj < no_elements; jj++)
+    if (number_list_serial[jj] != number_list[jj])
+      abort ();
+
+  return 0;
+}

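For readers following the patch: the omp-low.c comment describes the expanded loop as a call of the form __cilkrts_cilk_for_{32/64} (func_name, &omp_data_0, <count>, <grain>). The sketch below is a serial model of one plausible way such a call could hand out the iteration space, splitting it in half until a piece is no larger than the grain. It is not taken from the runtime; the names model_cilk_for, loop_body_fn and mark_visited, and the (data, low, high) body signature, are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical body signature: run iterations [low, high).  */
typedef void (*loop_body_fn) (void *data, unsigned low, unsigned high);

/* Serial model of a divide-and-conquer loop driver (an assumption, not
   the runtime's code).  The range is halved until a piece holds at most
   GRAIN iterations; each piece is then passed to BODY.  */
static void
model_cilk_for (loop_body_fn body, void *data,
                unsigned low, unsigned high, unsigned grain)
{
  assert (grain > 0 && low <= high);
  while (high - low > grain)
    {
      unsigned mid = low + (high - low) / 2;
      /* A parallel runtime could run this half concurrently.  */
      model_cilk_for (body, data, low, mid, grain);
      low = mid;
    }
  if (low < high)
    body (data, low, high);
}

/* Illustrative body: count how often each iteration is executed.  */
static void
mark_visited (void *data, unsigned low, unsigned high)
{
  int *visits = (int *) data;
  for (unsigned i = low; i < high; i++)
    visits[i]++;
}
```

With a count of 200 and a grain of 2, as in cilk_for_grain.c, this model runs every iteration exactly once, in pieces of at most two iterations.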

* Re: [PATCH] _Cilk_for for C and C++
  2014-01-08 19:46                                     ` Iyer, Balaji V
@ 2014-01-16 17:29                                       ` Jason Merrill
  2014-01-16 17:39                                         ` Jakub Jelinek
  2014-01-19  4:50                                         ` Iyer, Balaji V
  2014-01-16 21:19                                       ` Aldy Hernandez
  1 sibling, 2 replies; 42+ messages in thread
From: Jason Merrill @ 2014-01-16 17:29 UTC (permalink / raw)
  To: Iyer, Balaji V, Jakub Jelinek
  Cc: 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

On 01/08/2014 02:46 PM, Iyer, Balaji V wrote:
> +  /* Grain value, only used by _Cilk_for.  */
> +  tree grain;

Why can't the grain stay as a clause for the gimple form of the loop?

> +  if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
> +    {
> +      tree it = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), 0);
> +      gimplify_and_add (it, &for_pre_body);
> +    }

Why doesn't the normal handling of OMP_FOR_INIT work for Cilk?  All the 
special cases for CILK_FOR need comments explaining why they are needed.

Also, this seems like you're assigning to the control variable outside 
of the loop, which doesn't make sense because we initialize it in each 
of the invocations of the child function.  Right?

> +      /* Original initial, final and increment values are necessary to compute
> +	 the loop-count.  Otherwise, they are stored in variables and their
> +	 context could be changed, potentially making it impossible to compute
> +	 them correctly.  */

I don't understand.  Surely all you care about is the value, and 
gimplification shouldn't affect that.

> +	  /* If VAR is the induction variable of the outer _Cilk_for, then
> +	     it needs to be passed as a value not pointer since it
> +	     would not be overwritten by the body.  */

Here it looks like you're overriding the normal logic because we know 
that it's safe to assume the induction variable won't be changed by the 
body of the loop.  But why is the induction variable shared in the first 
place?  If it isn't going to change, it can be private.

> +	/* We cannot use the tsubst_omp_clauses since it will try to
> +	   do checking such as whether a certain clause can be used
> +	   with a certain for-loop.  We are just use schedule clause here
> +	   as a holder to hold the grain value.  */

I don't see the checking you mention.  Can't we fix it to do the right 
thing?

> +  if (code == CILK_FOR)
> +    {
> +      top_level_body = push_stmt_list ();
> +      top_body = begin_omp_parallel ();
> +    }

I wouldn't expect the front end to care that Cilk for is implemented 
using a parallel call; can't we bring that in at lowering time?

Jason


* Re: [PATCH] _Cilk_for for C and C++
  2014-01-16 17:29                                       ` Jason Merrill
@ 2014-01-16 17:39                                         ` Jakub Jelinek
  2014-01-19  4:50                                         ` Iyer, Balaji V
  1 sibling, 0 replies; 42+ messages in thread
From: Jakub Jelinek @ 2014-01-16 17:39 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Iyer, Balaji V, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

On Thu, Jan 16, 2014 at 12:29:44PM -0500, Jason Merrill wrote:
> >+  if (code == CILK_FOR)
> >+    {
> >+      top_level_body = push_stmt_list ();
> >+      top_body = begin_omp_parallel ();
> >+    }
> 
> I wouldn't expect the front end to care that Cilk for is implemented
> using a parallel call; can't we bring that in at lowering time?

The gimplifier already cares, so if it shouldn't be added early in the FE,
it must be added during genericization.
Unless we treat CILK_FOR as implicit parallel in the gimplifier, but I'd say
that it is better to have it expressed as parallel with cilk_for nested in
it and the combined flag set, so that omp-low.c can then emit it together.

	Jakub


* Re: [PATCH] _Cilk_for for C and C++
  2014-01-08 19:46                                     ` Iyer, Balaji V
  2014-01-16 17:29                                       ` Jason Merrill
@ 2014-01-16 21:19                                       ` Aldy Hernandez
  2014-01-17  9:24                                         ` Marek Polacek
  2014-01-19  4:53                                         ` Iyer, Balaji V
  1 sibling, 2 replies; 42+ messages in thread
From: Aldy Hernandez @ 2014-01-16 21:19 UTC (permalink / raw)
  To: Iyer, Balaji V, Jakub Jelinek
  Cc: Jason Merrill, 'Jeff Law',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

Here are a few things.

> +      if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR)
> +	{
> +	  error_at (input_location, "cannot convert grain to long integer.\n");
> +	  c_parser_skip_to_pragma_eol (parser);
> +	}

Remove final period.  Also, where's the testcase?  Also, there seems to 
be spurious white space after the "}".

Is it required that it be a long integer?  Because I see no further 
checks for this.

> +	  c_token *token = c_parser_peek_token (parser);
> +	  if (token && token->type == CPP_KEYWORD
> +	      && token->keyword == RID_CILK_FOR)

It doesn't look like c_parser_peek_token() ever returns NULL, so no need 
to check for token != 0.

> +	      tree grain = convert_to_integer (long_integer_type_node,
> +					       g_expr.value);
> +	      if (grain && grain != error_mark_node)
> +		c_parser_cilk_simd (parser, grain);

No need to check grain != 0 here either.

> ==> a.c.003t.original <==
>
> ;; Function main (null)
> ;; enabled by -tree-original
>
>
> {
>   int i;
>
>     int i;
>   <<< Unknown tree: cilk_for
>   #pragma omp parallel
>     {
>       {

Found with -fdump-tree-all.  You should handle the cilk_for tree code in 
the pretty printers, and add corresponding test(s).


>  static void
> -c_parser_cilk_simd (c_parser *parser)
> +c_parser_cilk_simd (c_parser *parser, tree grain)

No documentation for grain.

> +  tree clauses = NULL_TREE;
> +
> +  if (!is_cilk_for)
> +    clauses = c_parser_cilk_all_clauses (parser);
> +  else
> +    clauses = grain;

First set of clauses=NULL_TREE is useless.

>  static tree
>  c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
> -		       tree clauses, tree *cclauses)
> +		       tree clauses_or_grain, tree *cclauses)

Don't overload clauses and grainsize into one argument.  Add another 
argument.  Also, document said argument.

> +/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME.  */
> +
> +static tree
> +cilk_declare_looper (const char *name, tree type, enum built_in_function code)
> +{

I think you should document that it's creating a suitable built-in, not 
just creating a FUNCTION_DECL.  Also, please document argument `code'. 
And call this function something more meaningful, like 
"cilkrts_declare_builtin" or "cilk_declare_for_builtin", but definitely 
not looper :).

> @@ -1192,6 +1199,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
>  	    case GE_EXPR:
>  	      pp_greater_equal (buffer);
>  	      break;
> +	    case NE_EXPR:
> +	      pp_string (buffer, "!=");
> +	      break;

Thank you :).  That was probably my oversight on the pragma simd work.

> @@ -6603,6 +6614,11 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
>      }
>
>    for_body = NULL;
> +  if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
> +    {
> +      tree it = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), 0);
> +      gimplify_and_add (it, &for_pre_body);
> +    }

And what Jason said for all the special casing for CILK_FOR in this 
function...

> +static inline void
> +gimple_cilk_for_set_grain (tree grain, gimple gs)
> +{
> +  const gimple_statement_omp_for *omp_for_stmt =
> +    as_a <gimple_statement_omp_for> (gs);
> +  omp_for_stmt->iter[0].grain = grain;
> +}

Can we leave the grainsize in the clause, or does it have to reside in 
an auxiliary data structure?  It seems weird as is, since you're only 
setting grain for the first element.  I think Jason mentioned something 
similar.

> +/* A structure with necessary elements from _Cilk_for statement.  This
> +   struct. node is passed in to WALK_STMT_INFO->INFO.  */
> +typedef struct cilk_for_information {

"{" should be in a separate line.

"for a Cilk_for statement", not "from Cilk_for".  No abbreviation on 
struct.  Either remove the period, or spell it out.  Also s/in to/to/.

I'm not a C++ expert, but my understanding was that in C++ you don't 
need a typedef to use the following structure by name 
(cilk_for_information).  So you can just declare "struct 
cilk_for_information {...}" and instantiate it with just 
"cilk_for_information some_instance".  If that's the case, get rid of 
typedef.
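
A minimal sketch of the point above, assuming hypothetical members (the real fields of cilk_for_information are not shown in this thread).  In C++ the struct name alone is a type name, so neither a typedef nor the `struct' keyword is needed at the point of use:

```cpp
#include <cassert>

// Hypothetical members for illustration only; the real structure in
// omp-low.c carries whatever the _Cilk_for walker needs.
struct cilk_for_information
{
  bool found;
  int nesting_depth;
};

// The type is named directly: no typedef and no `struct' keyword.
inline cilk_for_information
make_cilk_for_information (int depth)
{
  assert (depth >= 0);
  cilk_for_information info = { false, depth };
  return info;
}
```

The helper make_cilk_for_information is also an assumption, added only so that the declaration has a use.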

> +  if (flag_enable_cilkplus

BTW, weren't you going to change this to flag_cilkplus or something in 
some past follow-up?

>   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
>   fd->chunk_size = NULL_TREE;
> +  if (flag_enable_cilkplus
> +      && gimple_omp_for_kind (fd->for_stmt) ==  GF_OMP_FOR_KIND_CILKFOR)
> +    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;

I believe most of the flag_enable_cilkplus checks in omp-low.c can be 
removed, especially the ones related to syntax.  You shouldn't be 
getting any cilk constructs this late if cilkplus was not enabled.  For 
that matter, we don't check flag_openmp in this file throughout, we only 
check for it at the gates.

>   bool is_cilk_for = (flag_enable_cilkplus && outer_ctx
> 		      && is_cilk_for_stmt (outer_ctx->stmt, &ind_var));

Although here you could probably leave it, since it would avoid 
traversing outer_ctx->stmt.  Speaking of which, since you're checking 
flag_enable_cilkplus in the caller, why also check it in 
is_cilk_for_stmt?  Do it all in the callee IMO.

Also, you should probably combine both initializations of sched_kind 
into the same if:

if (gimple_omp_for_kind (fd->for_stmt) ==  GF_OMP_FOR_KIND_CILKFOR)
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
else
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;

> +/* Returns the type of the induction variable for the child function for
> +   _Cilk_for and the types for _high and _low variables based on TYPE.  */

high, low, type, what?  I don't get it.  Can you rewrite the 
documentation to make it clearer what this does?

> +
> +static tree
> +cilk_for_check_loop_diff_type (tree type)
> +{
> +  if (type == integer_type_node)
> +    return type;
> +  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
> +    {
> +      if (TYPE_UNSIGNED (type))
> +	return uint32_type_node;
> +      else
> +	return integer_type_node;
> +    }
> +  else
> +    {
> +      if (TYPE_UNSIGNED (type))
> +	return uint64_type_node;
> +      else
> +	return long_long_integer_type_node;
> +    }
> +  gcc_unreachable ();
> +}

No need for a gcc_unreachable().  You have a final `else'; you'll 
clearly never reach the unreachable.
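
A sketch of the same selection with every path returning and no trailing unreachable marker.  This is an illustration under assumptions, not the patch's code: the enum loop_diff_type and the function pick_loop_diff_type are hypothetical stand-ins for the GCC type nodes, and plain `int' is assumed to be 32 bits wide so the first special case folds into the signed 32-bit branch:

```cpp
#include <cassert>

// Hypothetical stand-ins for integer_type_node, uint32_type_node,
// long_long_integer_type_node and uint64_type_node.
enum loop_diff_type { DIFF_INT32, DIFF_UINT32, DIFF_INT64, DIFF_UINT64 };

// Pick the type used for the loop count from the precision and
// signedness of the induction variable's type.
inline loop_diff_type
pick_loop_diff_type (unsigned precision, bool is_unsigned)
{
  assert (precision > 0);
  if (precision <= 32)
    return is_unsigned ? DIFF_UINT32 : DIFF_INT32;
  // Everything wider uses the 64-bit variant.
  return is_unsigned ? DIFF_UINT64 : DIFF_INT64;
}
```

Because the last statement is an unconditional return, the compiler can see that control never falls off the end of the function.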

>  static void
> -create_omp_child_function (omp_context *ctx, bool task_copy)
> +create_omp_child_function (omp_context *ctx, bool task_copy,
> +			   bool is_cilk_for, tree cilk_var_type)
>  {
>    tree decl, type, name, t;
> -
> -  name = create_omp_child_function_name (task_copy);
> +
> +  name = create_omp_child_function_name (task_copy, is_cilk_for);

It looks like task_copy and is_cilk_for never coexist throughout. 
Perhaps this is a candidate for an enum?
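
One way the enum suggestion could look, as a sketch only.  The names child_fn_kind, child_function_suffix and child_function_extra_params are hypothetical, and the suffix strings are illustrative assumptions rather than what the patch emits:

```cpp
#include <cassert>
#include <string>

// The three mutually exclusive kinds of outlined child function.
enum child_fn_kind { CHILD_FN_PARALLEL, CHILD_FN_TASK_COPY, CHILD_FN_CILK_FOR };

// Name suffix for the child function (strings are illustrative).
inline std::string
child_function_suffix (child_fn_kind kind)
{
  switch (kind)
    {
    case CHILD_FN_TASK_COPY:
      return "_omp_cpyfn";
    case CHILD_FN_CILK_FOR:
      return "_cilk_for_fn";
    case CHILD_FN_PARALLEL:
      break;
    }
  assert (kind == CHILD_FN_PARALLEL);
  return "_omp_fn";
}

// Number of parameters beyond the data pointer: a _Cilk_for child
// receives __low and __high from the runtime.
inline int
child_function_extra_params (child_fn_kind kind)
{
  return kind == CHILD_FN_CILK_FOR ? 2 : 0;
}
```

A single enum argument makes the impossible combination (task_copy and is_cilk_for both true) unrepresentable, which two bools cannot do.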

> +  /* _Cilk_for's child function requires two extra parameters called
> +     __low and __high that are set the by Cilk runtime when it calls this
> +     function.  */
> +  if (is_cilk_for)
> +    {
> +      t = build_decl (DECL_SOURCE_LOCATION (decl),
> +		      PARM_DECL, get_identifier ("__high"), cilk_var_type);

Perhaps you should add a similar comment here too:

> +  else if (is_cilk_for)
> +    type = build_function_type_list (void_type_node, ptr_type_node,
> +				     cilk_var_type, cilk_var_type, NULL_TREE);


> +	      /* In _Cilk_for, the increment, start and final values
> +		 are stored in the clause inserted by gimplify_omp_for.
> +		 This value is used by the child function to find the
> +		 appropriate induction value function based on the
> +		 high and low parameters of the child function.
> +		 Now, we need to store the decl value expressions here so
> +		 that we can easily access them.  */
> +	      if (flag_enable_cilkplus
> +		  && (is_cilk_loop_var (var, "__cilk_init")
> +		      || is_cilk_loop_var (var, "__cilk_cond")
> +		      || is_cilk_loop_var (var, "__cilk_incr")))
> +		SET_DECL_VALUE_EXPR (var, x);

This looks weird.  I'll let Jakub and Jason comment on whether this is 
the correct place to put this information.  Can't you check somehow if 
the loop is a Cilk_for loop without having to look at the variable names 
themselves?

> +/* Returns true if T is a tree whose code is COMPONENT_REF and its field
> +   matches D_F_NAME and the data argument matches D_ARG_NAME.  */
> +
> +static bool
> +cilk_find_field_value (tree t, tree d_arg_name, tree d_f_name)
> +{
> +  if (TREE_CODE (t) == COMPONENT_REF)
> +    {
> +      tree arg = TREE_OPERAND (t, 0);
> +      tree field = TREE_OPERAND (t, 1);
> +      if (TREE_CODE (arg) == ADDR_EXPR || TREE_CODE (arg) == MEM_REF)
> +	arg = TREE_OPERAND (arg, 0);
> +      if (DECL_NAME (arg) && DECL_NAME (field)
> +	  && !strcmp (IDENTIFIER_POINTER (d_arg_name),
> +		      IDENTIFIER_POINTER (DECL_NAME (arg)))
> +	  && !strcmp (IDENTIFIER_POINTER (d_f_name),
> +		      IDENTIFIER_POINTER (DECL_NAME (field))))

Also weird.  Do we really need to look at the identifier pointer itself? 
  Again, I'll let Jason and/or Jakub comment.

I'll let Jakub comment on the functional parts of the omp-low.c parts, 
but I would prefer that a lot of big functions in omp-low.c that only 
pertain to Cilk Plus, be moved to a cilk specific file.  For example, 
expand_cilk_for_body() and helpers.

Aldy


* Re: [PATCH] _Cilk_for for C and C++
  2014-01-16 21:19                                       ` Aldy Hernandez
@ 2014-01-17  9:24                                         ` Marek Polacek
  2014-01-19  4:53                                         ` Iyer, Balaji V
  1 sibling, 0 replies; 42+ messages in thread
From: Marek Polacek @ 2014-01-17  9:24 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Iyer, Balaji V, Jakub Jelinek, Jason Merrill, 'Jeff Law',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

On Thu, Jan 16, 2014 at 01:18:59PM -0800, Aldy Hernandez wrote:
> I'm not a C++ expert, but my understanding was that in C++ you don't
> need a typedef to use the following structure by name
> (cilk_for_information).  So you can just declare "struct
> cilk_for_information {...}" and instantiate it with just
> "cilk_for_information some_instance".  If that's the case, get rid
> of typedef.

Yes.  That's what create_implicit_typedef does.

	Marek


* RE: [PATCH] _Cilk_for for C and C++
  2014-01-16 17:29                                       ` Jason Merrill
  2014-01-16 17:39                                         ` Jakub Jelinek
@ 2014-01-19  4:50                                         ` Iyer, Balaji V
  2014-01-23 10:12                                           ` Jakub Jelinek
  1 sibling, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-19  4:50 UTC (permalink / raw)
  To: Jason Merrill, Jakub Jelinek
  Cc: 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

[-- Attachment #1: Type: text/plain, Size: 4149 bytes --]

Hi Jason,
	I have answered your questions below. In addition to your changes, I have also fixed the issues Aldy pointed out and have answered his questions in that thread.  With this email I have attached two patches and two change-logs (for C and C++). I have also rebased these patches to the trunk revision (r206756).

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Jason Merrill [mailto:jason@redhat.com]
> Sent: Thursday, January 16, 2014 12:30 PM
> To: Iyer, Balaji V; Jakub Jelinek
> Cc: 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org'; 'rth@redhat.com'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On 01/08/2014 02:46 PM, Iyer, Balaji V wrote:
> > +  /* Grain value, only used by _Cilk_for.  */
> > +  tree grain;
> 
> Why can't the grain stay as a clause for the gimple form of the loop?
> 

Ok. Removed the field and modified the code accordingly.

> > +  if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
> > +    {
> > +      tree it = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), 0);
> > +      gimplify_and_add (it, &for_pre_body);
> > +    }
> 
> Why doesn't the normal handling of OMP_FOR_INIT work for Cilk?  All the
> special cases for CILK_FOR need comments explaining why they are needed.
> 
Removed. It was an old artifact

> Also, this seems like you're assigning to the control variable outside of the
> loop, which doesn't make sense because we initialize it in each of the
> invocations of the child function.  Right?
> 

I am just saving the values to a temporary variable to hold the 

> > +      /* Original initial, final and increment values are necessary to compute
> > +	 the loop-count.  Otherwise, they are stored in variables and their
> > +	 context could be changed, potentially making it impossible to compute
> > +	 them correctly.  */
> 
> I don't understand.  Surely all you care about is the value, and gimplification
> shouldn't affect that.
>
> > +	  /* If VAR is the induction variable of the outer _Cilk_for, then
> > +	     it needs to be passed as a value not pointer since it
> > +	     would not be overwritten by the body.  */
> 
> Here it looks like you're overriding the normal logic because we know that it's
> safe to assume the induction variable won't be changed by the body of the
> loop.  But why is the induction variable shared in the first place?  If it isn't
> going to change, it can be private.
> 

In OMP, when we have nested fors, the induction variable is passed in as an address. For us, we just need it passed as a value since we modify it. The original induction variable is replaced (in value) by the new induction variable inserted in the child function body. This is done here to fix that.

> > +	/* We cannot use the tsubst_omp_clauses since it will try to
> > +	   do checking such as whether a certain clause can be used
> > +	   with a certain for-loop.  We are just use schedule clause here
> > +	   as a holder to hold the grain value.  */
> 
> I don't see the checking you mention.  Can't we fix it to do the right thing?
> 

This again was an artifact. This is removed also.

> > +  if (code == CILK_FOR)
> > +    {
> > +      top_level_body = push_stmt_list ();
> > +      top_body = begin_omp_parallel ();
> > +    }
> 
> I wouldn't expect the front end to care that Cilk for is implemented using a
> parallel call; can't we bring that in at lowering time?
> 

The way I have implemented _Cilk_for is like a for-loop where the for-statement, excluding the body, is like a #pragma omp for and the body itself is a #pragma omp parallel. This is sort of backwards from OMP's for loops. This way, pulling the body into a child function, etc. is done by the existing infrastructure. The child function is modified accordingly to fit _Cilk_for in a function called expand_cilk_for_body that is called from expand_omp_taskreg. During expand_omp_for we just insert the call to the runtime function.

I am sure there are other ways to do this, but this seems to be the most straightforward way to do it with the least amount of code change and duplication.
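
The outlining scheme described above can be sketched roughly as follows.  This is a serial illustration under assumptions, not the generated code: loop_data, child_fn and fake_cilk_for_32 are hypothetical names, the driver only mimics how a runtime entry point might split the iteration space down to the grain, and a grain of zero (meaning "let the runtime choose") is not modelled:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// State captured for the outlined body (hypothetical layout).
struct loop_data
{
  std::vector<int> *out;
  int init;  // initial value of the induction variable
  int incr;  // loop increment
};

// Outlined loop body: runs normalized iterations [low, high) and
// rebuilds the user's induction variable from init and incr.
static void
child_fn (void *data, std::uint32_t low, std::uint32_t high)
{
  loop_data *d = static_cast<loop_data *> (data);
  for (std::uint32_t k = low; k < high; ++k)
    {
      int ii = d->init + static_cast<int> (k) * d->incr;
      (*d->out)[k] = ii;
    }
}

typedef void (*body_fn) (void *, std::uint32_t, std::uint32_t);

// Serial stand-in for the runtime call: halve the range until a chunk
// is no larger than GRAIN, then invoke the body on that chunk.  A real
// runtime would run the two halves in parallel.
static void
fake_cilk_for_32 (body_fn body, void *data, std::uint32_t low,
		  std::uint32_t high, std::uint32_t grain)
{
  assert (grain > 0 && low <= high);
  if (high - low <= grain)
    {
      body (data, low, high);
      return;
    }
  std::uint32_t mid = low + (high - low) / 2;
  fake_cilk_for_32 (body, data, low, mid, grain);
  fake_cilk_for_32 (body, data, mid, high, grain);
}
```

With a loop such as `for (ii = 5; ii < 25; ii += 2)', the loop count is 10, so the driver would be called with the range [0, 10) and the child recomputes ii for each normalized index.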

> Jason





[-- Attachment #2: c-ChangeLog --]
[-- Type: application/octet-stream, Size: 4280 bytes --]

gcc/ChangeLog
2014-01-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* cilk-builtins.def: Added two new builtin functions called
	__cilkrts_cilk_for_32 and __cilkrts_cilk_for_64.
	* cilk-common.c (cilk_init_builtins): Likewise.
	(cilk_declare_looper): New function.
	* cilk.h (enum cilk_tree_index): Added two new fields called
	CILK_TI_F_LOOP_32 and CILK_TI_F_LOOP_64.
	(cilk_for_32_fndecl): New #define.
	(cilk_for_64_fndecl): Likewise.
	* gimple-pretty-print.c (dump_gimple_omp_for): Added a new case for
	GF_OMP_FOR_KIND_CILKFOR.  Also emitted "_Cilk_for" instead of "for ("
	for when the gimple kind is GF_OMP_FOR_KIND_CILKFOR.
	* gimple.h (enum gf_mask): Added a new field GF_OMP_FOR_KIND_CILKFOR.
	Re-arranged couple other fields to make them all in ascending order.
	(struct gimple_omp_for_iter): Added a new field called "grain."
	(gimple_cilk_for_set_grain): New function.
	(gimple_cilk_for_induction_var): Likewise.
	(gimple_cilk_for_grain): Likewise.
	* gimplify.c (gimplify_omp_for): Added code to handle gimplification
	of a _Cilk_for statement.
	* omp-low.c (struct cilk_for_information): New structure.
	(create_omp_child_function_name): Added a new bool parameter called
	is_cilk_for.  If this is set, then use a different suffix.
	(extract_omp_for_data): Added a check for _Cilk_for's kind for a
	NE_EXPR case.  Added the correct schedule type for _Cilk_for.
	(use_pointer_for_field): Reject using of pointers for the induction
	variable of the outer function.
	(find_cilk_for_stmt): New function.
	(is_cilk_for_stmt): Likewise.
	(cilk_for_check_loop_diff_type): Likewise.
	(expand_cilk_for_body): Likewise.
	(is_cilk_loop_var): Likewise.
	(cilk_find_field_value): Likewise.
	(cilk_find_component_expr): Likewise.
	(find_cilk_for_vars): Likewise.
	(insert_cilk_for_fn_call): Likewise.
	(create_omp_child_function): Added two new parameters to pass in
	whether it is a _Cilk_for body and the induction variable type.  If
	it is _Cilk_for, then create two new parameters and different function-
	type.
	(lower_rec_input_clauses): Set the new decl expr value to the
	variable for the "__cilk_init," "__cilk_cond" and "__cilk_incr"
	variables.
	(scan_omp_parallel): Added a check if the outer statement is a
	_Cilk_for and if so, then find the correct induction variable type to
	pass them into create_omp_child_function.
	(expand_omp_taskreg): Added code to extract the high and low parameters
	from the child function and then insert it in the appropriate location.
	Added a call to expand_cilk_for_body.  Allowed the insertion of the
	library calls when the taskreg being expanded is not a _Cilk_for.
	(expand_omp_for): Added a check for GF_OMP_FOR_KIND_CILKFOR for the
	for statement's kind.  If so then call insert_cilk_for_fn_call.
	* tree-core.h (enum omp_clause_schedule_kind): Added a new field
	OMP_CLAUSE_SCHEDULE_CILK_FOR.
	* tree.def (CILK_FOR): New tree.

gcc/c-family/ChangeLog
2014-01-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-omp.c (c_finish_omp_for): Added a check for CILK_FOR along with
	CILK_SIMD.
	* c-common.h (enum rid): Added new value called "RID_CILK_FOR."
	* c-common.c (c_common_reswords[]): Added a new field "_Cilk_for."
	* c-pragma.c (init_pragma): Added cilk grainsize pragma.
	* c-pragma.h (enum pragma_kind): Added PRAGMA_CILK_GRAINSIZE.

gcc/c/ChangeLog
2014-01-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-parser.c (c_parser_statement_after_labels): Added a RID_CILK_FOR
	case.
	(c_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(c_parser_omp_for_loop): Added new parameter called grain.
	Added handling for _Cilk_for statements.
	(c_parser_cilk_grainsize): New function.
	(c_parser_cilk_simd): Added a new parameter called grain.  Also added
	support to parse _Cilk_for statements.

gcc/testsuite/ChangeLog
2014-01-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk-fors.c: New testcase.
	* c-c++-common/cilk-plus/CK/nested_cilk_for.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.

[-- Attachment #3: cp-ChangeLog --]
[-- Type: application/octet-stream, Size: 1437 bytes --]

gcc/cp/ChangeLog
2014-01-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* parser.c (cp_parser_statement): Added a RID_CILK_FOR case.
	(cp_parser_omp_for_cond): Included a check for CILK_FOR along with
	CILK_SIMD.
	(cp_parser_omp_for_loop): Overall, added support to parse _Cilk_for
	statement along with omp for statements.
	(cp_parser_cilk_grainsize): New function.
	(cp_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(cp_parser_cilk_simd): Added a new parameter for grain.  Added support
	to handle _Cilk_for loops along with #pragma simd for loops.
	* pt.c (tsubst_expr): Added CILK_FOR case.
	* semantics.c (handle_omp_for_class_iterator): Added 2 new parameters.
	Added a NE_EXPR case.  Added a check for _Cilk_for statement and
	if so, then give a name for the new induction variable.
	(finish_omp_for): Added a check if the code is _Cilk_for and if true
	then insert all the iterator temporary variables into the _Cilk_for
	body.

gcc/testsuite/ChangeLog
2014-01-18  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Made certain error
	tags C specific and inserted their C++ equivalents.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* g++.dg/cilk-plus/CK/cilk-for-tplt.cc: New testcase.
	* g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_rev_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_test.cc: Likewise.


[-- Attachment #4: diff_c.txt --]
[-- Type: text/plain, Size: 65344 bytes --]

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
old mode 100644
new mode 100755
index 35958ea..9135030
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -414,6 +414,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7e3ece6..0eaebf3 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 4ce51e4..bb4f6a1 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -397,7 +397,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
   bool fail = false;
   int i;
 
-  if (code == CILK_SIMD
+  if ((code == CILK_SIMD || code == CILK_FOR) 
       && !c_check_cilk_loop (locus, TREE_VEC_ELT (declv, 0)))
     fail = true;
 
@@ -527,7 +527,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 					    0))
 		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
 					 ? LT_EXPR : GE_EXPR);
-		  else if (code != CILK_SIMD)
+		  else if (code != CILK_SIMD && code != CILK_FOR)
 		    cond_ok = false;
 		}
 	    }
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index af28085..6f22148 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1394,6 +1394,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 6f1bf74..b9f09ba 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
old mode 100644
new mode 100755
index 4490210..10d5f6e
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1248,10 +1248,11 @@ static bool c_parser_objc_diagnose_bad_element_prefix
   (c_parser *, struct c_declspecs *);
 
 /* Cilk Plus supporting routines.  */
-static void c_parser_cilk_simd (c_parser *);
+static void c_parser_cilk_simd (c_parser *, tree);
 static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context);
 static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
 static tree c_parser_cilk_clause_vectorlength (c_parser *, tree, bool);
+static void c_parser_cilk_grainsize (c_parser *);
 
 /* Parse a translation unit (C90 6.7, C99 6.9).
 
@@ -4878,6 +4879,16 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (c_parser_peek_token (parser)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      c_parser_skip_to_end_of_block_or_statement (parser);
+	    }
+	  else
+	    c_parser_cilk_simd (parser, integer_zero_node);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9488,7 +9499,25 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       if (!c_parser_cilk_verify_simd (parser, context))
 	return false;
       c_parser_consume_pragma (parser);
-      c_parser_cilk_simd (parser);
+      c_parser_cilk_simd (parser, NULL_TREE);
+      return false;
+
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not"
+		   " enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
       return false;
 
     default:
@@ -11583,7 +11612,7 @@ c_parser_omp_flush (c_parser *parser)
 
 static tree
 c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
-		       tree clauses, tree *cclauses)
+		       tree clauses, tree grain, tree *cclauses)
 {
   tree decl, cond, incr, save_break, save_cont, body, init, stmt, cl;
   tree declv, condv, incrv, initv, ret = NULL;
@@ -11591,6 +11620,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   int i, collapse = 1, nbraces = 0;
   location_t for_loc;
   vec<tree, va_gc> *for_block = make_tree_vector ();
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
 
   for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl))
     if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE)
@@ -11603,11 +11633,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   condv = make_tree_vec (collapse);
   incrv = make_tree_vec (collapse);
 
-  if (!c_parser_next_token_is_keyword (parser, RID_FOR))
+  if (code != CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_FOR))
     {
       c_parser_error (parser, "for statement expected");
       return NULL;
     }
+  if (code == CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_CILK_FOR))
+    {
+      c_parser_error (parser, "_Cilk_for statement expected");
+      return NULL;
+    }
   for_loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
 
@@ -11685,7 +11722,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	    case LE_EXPR:
 	      break;
 	    case NE_EXPR:
-	      if (code == CILK_SIMD)
+	      if (code == CILK_SIMD || code == CILK_FOR)
 		break;
 	      /* FALLTHRU.  */
 	    default:
@@ -11767,6 +11804,12 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
     c_break_label = size_one_node;
   save_cont = c_cont_label;
   c_cont_label = NULL_TREE;
+
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = c_begin_omp_parallel ();
+    }
   body = push_stmt_list ();
 
   if (open_brace_parsed)
@@ -11814,6 +11857,13 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	}
     }
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = c_finish_omp_parallel (loc, NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
+
   /* Only bother calling c_finish_omp_for if we haven't already generated
      an error from the initialization parsing.  */
   if (!fail)
@@ -11859,6 +11909,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 		  }
 	    }
 	  OMP_FOR_CLAUSES (stmt) = clauses;
+	  /* If it is a _Cilk_for statement, then the OMP_FOR_CLAUSES location
+	     stores the user-defined grain value or an integer_zero_node 
+	     indicating that the runtime must compute a suitable grain.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree l = build_omp_clause (EXPR_LOCATION (grain),
+					 OMP_CLAUSE_SCHEDULE);
+	      OMP_CLAUSE_SCHEDULE_KIND (l) = OMP_CLAUSE_SCHEDULE_RUNTIME;
+	      OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (l) = grain;
+	      OMP_CLAUSE_CHAIN (l) = NULL_TREE;
+	      OMP_FOR_CLAUSES (stmt) = l;
+	    }
 	}
       ret = stmt;
     }
@@ -11923,7 +11985,8 @@ c_parser_omp_simd (location_t loc, c_parser *parser,
     }
 
   block = c_begin_compound_stmt (true);
-  ret = c_parser_omp_for_loop (loc, parser, OMP_SIMD, clauses, cclauses);
+  ret = c_parser_omp_for_loop (loc, parser, OMP_SIMD, clauses, NULL_TREE,  
+			       cclauses);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 
@@ -12003,7 +12066,8 @@ c_parser_omp_for (location_t loc, c_parser *parser,
     }
 
   block = c_begin_compound_stmt (true);
-  ret = c_parser_omp_for_loop (loc, parser, OMP_FOR, clauses, cclauses);
+  ret = c_parser_omp_for_loop (loc, parser, OMP_FOR, clauses,  NULL_TREE, 
+			       cclauses);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 
@@ -12486,7 +12550,8 @@ c_parser_omp_distribute (location_t loc, c_parser *parser,
     }
 
   block = c_begin_compound_stmt (true);
-  ret = c_parser_omp_for_loop (loc, parser, OMP_DISTRIBUTE, clauses, NULL);
+  ret = c_parser_omp_for_loop (loc, parser, OMP_DISTRIBUTE, clauses, NULL_TREE,
+			       NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 
@@ -13762,16 +13827,63 @@ c_parser_cilk_all_clauses (c_parser *parser)
   return c_finish_cilk_clauses (clauses);
 }
 
+/* This function helps parse the grainsize pragma for a _Cilk_for statement. 
+   Here is the correct syntax of this pragma: 
+	    #pragma cilk grainsize = <EXP>  */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+
+  /* Consume the 'grainsize' keyword.  */
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>"))
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token->type == CPP_KEYWORD && token->keyword == RID_CILK_FOR)
+	    {
+	      /* Remove the excess precision expression wrapper since
+		 we are going to convert it to long int.  */
+	      if (TREE_CODE (g_expr.value) == EXCESS_PRECISION_EXPR)
+		g_expr.value = TREE_OPERAND (g_expr.value, 0);
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain != error_mark_node) 
+		c_parser_cilk_simd (parser, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
-   loops.  */
+   loops.  GRAIN is analogous to a chunk size and is passed in
+   by the user through a grainsize pragma for _Cilk_for.  If the
+   value is zero, then the runtime computes an appropriate value.  */
 
 static void
-c_parser_cilk_simd (c_parser *parser)
+c_parser_cilk_simd (c_parser *parser, tree grain)
 {
-  tree clauses = c_parser_cilk_all_clauses (parser);
+  bool is_cilk_for = grain == NULL_TREE ? false : true;
+  tree clauses = NULL_TREE;
+  
+  if (!is_cilk_for)
+    clauses = c_parser_cilk_all_clauses (parser);
   tree block = c_begin_compound_stmt (true);
   location_t loc = c_parser_peek_token (parser)->location;
-  c_parser_omp_for_loop (loc, parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  c_parser_omp_for_loop (loc, parser, code, clauses, grain, NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 }
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
index 9f3240a..bf319d5 100644
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
index afe88c9..753e5f0 100644
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -106,6 +106,27 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME.  */
+
+static tree
+declare_cilk_for_builtin (const char *name, tree type,
+			  enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -269,6 +290,15 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = declare_cilk_for_builtin ("__cilkrts_cilk_for_32",
+						 unsigned_intSI_type_node,
+						 BUILT_IN_CILK_FOR_32);
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = declare_cilk_for_builtin ("__cilkrts_cilk_for_64",
+						 unsigned_intDI_type_node,
+						 BUILT_IN_CILK_FOR_64);
+
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
index d2ae931..0e98998 100644
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,9 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
+
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -77,6 +80,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 2d1e1c7..1e7bebf 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1158,6 +1158,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_DISTRIBUTE:
 	  pp_string (buffer, "#pragma omp distribute");
 	  break;
+	case GF_OMP_FOR_KIND_CILKFOR:
+	  gcc_assert (flag_enable_cilkplus);
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -1167,7 +1170,11 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	  if (i)
 	    spc += 2;
 	  newline_and_indent (buffer, spc);
-	  pp_string (buffer, "for (");
+	  if (flag_enable_cilkplus 
+	      && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
+	    pp_string (buffer, "_Cilk_for (");
+	  else
+	    pp_string (buffer, "for (");
 	  dump_generic_node (buffer, gimple_omp_for_index (gs, i), spc,
 			     flags, false);
 	  pp_string (buffer, " = ");
@@ -1192,6 +1199,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	    case GE_EXPR:
 	      pp_greater_equal (buffer);
 	      break;
+	    case NE_EXPR:
+	      pp_string (buffer, "!=");
+	      break;
 	    default:
 	      gcc_unreachable ();
 	    }
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 0e80d2e..194045c 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -91,13 +91,14 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
-    GF_OMP_FOR_KIND_MASK	= 3 << 0,
+    GF_OMP_FOR_KIND_MASK	= 7 << 0,
     GF_OMP_FOR_KIND_FOR		= 0 << 0,
     GF_OMP_FOR_KIND_DISTRIBUTE	= 1 << 0,
     GF_OMP_FOR_KIND_SIMD	= 2 << 0,
     GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
-    GF_OMP_FOR_COMBINED		= 1 << 2,
-    GF_OMP_FOR_COMBINED_INTO	= 1 << 3,
+    GF_OMP_FOR_KIND_CILKFOR	= 4 << 0,
+    GF_OMP_FOR_COMBINED		= 1 << 3,
+    GF_OMP_FOR_COMBINED_INTO	= 1 << 4,
     GF_OMP_TARGET_KIND_MASK	= 3 << 0,
     GF_OMP_TARGET_KIND_REGION	= 0 << 0,
     GF_OMP_TARGET_KIND_DATA	= 1 << 0,
@@ -4563,6 +4564,16 @@ gimple_omp_for_set_pre_body (gimple gs, gimple_seq pre_body)
   omp_for_stmt->pre_body = pre_body;
 }
 
+/* Returns the induction variable of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_induction_var (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->index;
+}
 
 /* Return the clauses associated with OMP_PARALLEL GS.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
old mode 100644
new mode 100755
index 9c9998d..c671c98
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6559,6 +6559,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
   bool simd;
   bitmap has_decl_expr = NULL;
 
+  tree orig_init = NULL_TREE, orig_cond = NULL_TREE, orig_incr = NULL_TREE;
   orig_for_stmt = for_stmt = *expr_p;
 
   simd = TREE_CODE (for_stmt) == OMP_SIMD
@@ -6678,6 +6679,11 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       else
 	var = decl;
 
+      /* Original initial, final and increment values are necessary to compute
+	 the loop-count.  Otherwise, they are stored in variables and their
+	 context could be changed, potentially making it impossible to compute
+	 them correctly.  */
+      orig_init = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6689,10 +6695,22 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gcc_assert (COMPARISON_CLASS_P (t));
       gcc_assert (TREE_OPERAND (t, 0) == decl);
 
-      tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-			    is_gimple_val, fb_rvalue);
-      ret = MIN (ret, tret);
-
+      /* Make sure the original end-value is saved untouched for _Cilk_for.
+	 In C++ templates, it modifies the condition value but keeps the
+	 init. value the same.  Since we are using both to compute loop-count
+	 we need to keep them both in the original condition.  */
+      if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
+	{
+	  int x = 1;
+	  orig_cond = TREE_OPERAND (t, 1);
+	  copy_tree_r (&orig_cond, &x, NULL);
+	}
+      else
+	{
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL, 
+				is_gimple_val, fb_rvalue);
+	  ret = MIN (ret, tret);
+	}
       /* Handle OMP_FOR_INCR.  */
       t = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
       switch (TREE_CODE (t))
@@ -6713,6 +6731,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	    t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	    t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	    TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	    orig_incr = build_one_cst (TREE_TYPE (t));
 	    break;
 	  }
 
@@ -6726,6 +6745,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  orig_incr = build_one_cst (TREE_TYPE (t));
 	  break;
 
 	case MODIFY_EXPR:
@@ -6753,8 +6773,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	      gcc_unreachable ();
 	    }
 
-	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-				is_gimple_val, fb_rvalue);
+	  orig_incr = TREE_OPERAND (t, 1);
+	  /* Extract the absolute value of the increment.  */
+	  if (TREE_CODE (t) == MINUS_EXPR
+	      || TREE_CODE (TREE_OPERAND (t, 1)) == NEGATE_EXPR
+	      || (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST
+		  && tree_int_cst_sgn (TREE_OPERAND (t, 1)) < 1))
+	    orig_incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), orig_incr);
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body,
+				NULL, is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
 	  if (c)
 	    {
@@ -6802,8 +6829,60 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 
   BITMAP_FREE (has_decl_expr);
 
+  /* Save the initial, final and step-size values into variables.  */
+  tree incr_val = NULL_TREE, init_val = NULL_TREE, cond_val = NULL_TREE;
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      tree stmt_list = alloc_stmt_list ();
+      incr_val = create_tmp_var (TREE_TYPE (orig_incr), "__cilk_incr");
+      tree mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_incr), incr_val,
+			 orig_incr);
+      append_to_statement_list (mod, &stmt_list);
+
+      init_val = create_tmp_var (TREE_TYPE (orig_init), "__cilk_init");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_init), init_val, orig_init);
+      append_to_statement_list (mod, &stmt_list);
+
+      cond_val = create_tmp_var (TREE_TYPE (orig_cond), "__cilk_cond");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_cond), cond_val, orig_cond);
+      append_to_statement_list (mod, &stmt_list);
+  
+      gimplify_and_add (stmt_list, &for_pre_body);
+    }
   gimplify_and_add (OMP_FOR_BODY (orig_for_stmt), &for_body);
 
+  /* Set the variables holding initial, final and step-size as shared and
+     insert them as clauses.  */
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      /* Sometimes an assign is inserted before the OMP_FOR_BODY.  So,
+	 search and find the omp for body.  */
+      gimple for_body_stmt = NULL;
+      for (gimple_stmt_iterator gsi = gsi_start (for_body); !gsi_end_p (gsi);
+	   gsi_next (&gsi))
+	{
+	  for_body_stmt = gsi_stmt (gsi);
+	  if (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL)
+	    break;
+	}
+      gcc_assert (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL);
+      tree orig_clses = gimple_omp_parallel_clauses (for_body_stmt);
+      tree new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = init_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = cond_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = incr_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      gimple_omp_parallel_set_clauses (for_body_stmt, new_clause);
+    }
   if (orig_for_stmt != for_stmt)
     for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
       {
@@ -6825,6 +6904,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
     case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
     case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break;
+    case CILK_FOR: kind = GF_OMP_FOR_KIND_CILKFOR; break;
     case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break;
     default:
       gcc_unreachable ();
@@ -7897,6 +7977,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OMP_FOR:
 	case OMP_SIMD:
 	case CILK_SIMD:
+	case CILK_FOR:
 	case OMP_DISTRIBUTE:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 5a09b33..c3cea3d 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "tree-nested.h"
 #include "tree-eh.h"
+#include "cilk.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -198,6 +199,13 @@ struct omp_for_data
   struct omp_for_data_loop *loops;
 };
 
+/* A structure with necessary elements from _Cilk_for statement.  This
+   structure node is passed to WALK_STMT_INFO->INFO.  */
+struct cilk_for_info
+{
+  bool found;
+  tree induction_var;
+};
 
 static splay_tree all_contexts;
 static int taskreg_nesting_level;
@@ -314,6 +322,9 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   fd->have_ordered = false;
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
   fd->chunk_size = NULL_TREE;
+  if (flag_enable_cilkplus 
+      && gimple_omp_for_kind (fd->for_stmt) ==  GF_OMP_FOR_KIND_CILKFOR)
+    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
   collapse_iter = NULL;
   collapse_count = NULL;
 
@@ -328,7 +339,13 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	break;
       case OMP_CLAUSE_SCHEDULE:
 	gcc_assert (!distribute);
-	fd->sched_kind = OMP_CLAUSE_SCHEDULE_KIND (t);
+	/* In _Cilk_for the sched kind doesn't make sense since we have
+	   our own scheduling.  This check is done to make sure we do not
+	   hit the asserts given below since we are purposely setting
+	   the sched_kind and the chunk size to hold the grain.  */
+	if (flag_enable_cilkplus
+	    && gimple_omp_for_kind (fd->for_stmt) != GF_OMP_FOR_KIND_CILKFOR)
+	  fd->sched_kind = OMP_CLAUSE_SCHEDULE_KIND (t);
 	fd->chunk_size = OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (t);
 	break;
       case OMP_CLAUSE_DIST_SCHEDULE:
@@ -391,8 +408,10 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	case GT_EXPR:
 	  break;
 	case NE_EXPR:
-	  gcc_assert (gimple_omp_for_kind (for_stmt)
-		      == GF_OMP_FOR_KIND_CILKSIMD);
+	  gcc_assert ((gimple_omp_for_kind (for_stmt)
+		       == GF_OMP_FOR_KIND_CILKSIMD)
+		      || (gimple_omp_for_kind (for_stmt)
+			  == GF_OMP_FOR_KIND_CILKFOR));
 	  break;
 	case LE_EXPR:
 	  if (POINTER_TYPE_P (TREE_TYPE (loop->n2)))
@@ -897,7 +916,31 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx)
 	 variable no longer really shared.  */
       if (shared_ctx->is_nested)
 	{
-	  omp_context *up;
+	  omp_context *up = shared_ctx->outer;
+
+	  /* If DECL is the induction variable of the outer _Cilk_for, then
+	     it needs to be passed by value, not by pointer, since it
+	     is not overwritten by the body.  */
+	  if (flag_enable_cilkplus
+	      && gimple_code (up->stmt) == GIMPLE_OMP_FOR
+	      && gimple_omp_for_kind (up->stmt) == GF_OMP_FOR_KIND_CILKFOR) 
+	    while (up) 
+	      { 
+		if (gimple_code (up->stmt) == GIMPLE_OMP_FOR
+		    && gimple_omp_for_kind (up->stmt)
+		    == GF_OMP_FOR_KIND_CILKFOR)
+		  {
+		    struct omp_for_data fd;
+		    /* _Cilk_for always has collapse = 1.  */
+		    struct omp_for_data_loop *loops
+		      = (struct omp_for_data_loop *)
+		      alloca (sizeof (struct omp_for_data_loop));
+		    extract_omp_for_data (up->stmt, &fd, loops);
+		    if (DECL_NAME (decl) == DECL_NAME (fd.loop.v))
+		      return false;
+		  }
+		up = up->outer;
+	      }
 
 	  for (up = shared_ctx->outer; up; up = up->outer)
 	    if (is_taskreg_ctx (up) && maybe_lookup_decl (decl, up))
@@ -1818,27 +1861,107 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	scan_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 }
 
-/* Create a new name for omp child function.  Returns an identifier.  */
+/* Create a new name for omp child function.  Returns an identifier.  If
+   IS_CILK_FOR is true, then the suffix for the child function is
+   "_cilk_for_fn".  */
 
 static tree
-create_omp_child_function_name (bool task_copy)
+create_omp_child_function_name (bool task_copy, bool is_cilk_for)
 {
+  if (is_cilk_for)
+    return clone_function_name (current_function_decl, "_cilk_for_fn");
   return (clone_function_name (current_function_decl,
 			       task_copy ? "_omp_cpyfn" : "_omp_fn"));
 }
 
+/* Helper function for walk_gimple_seq.  *GSI_P is the gimple statement
+   iterator passed by walk_gimple_seq and *WI->INFO holds the CILK_FOR_INFO
+   structure.  This function sets the values inside this structure if it
+   finds a _Cilk_for statement in *GSI_P.  HANDLED_OPS_P is unused.  */
+
+static tree
+find_cilk_for_stmt (gimple_stmt_iterator *gsi_p,
+		    bool *handled_ops_p ATTRIBUTE_UNUSED,
+		    struct walk_stmt_info *wi)
+{
+  struct cilk_for_info *cf_info = (struct cilk_for_info *) wi->info;
+  gimple stmt = gsi_stmt (*gsi_p);
+
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+      && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_CILKFOR)
+      /* For nested _Cilk_for statements, just look into the
+	 outer-most one.  */
+      && cf_info->found == false)
+    {
+      cf_info->found = true;
+      cf_info->induction_var = gimple_cilk_for_induction_var (stmt);
+    }
+  return NULL_TREE;
+}
+
+/* Returns true if STMT contains a CILK_FOR statement.  If one is found,
+   *IND_VAR is set to its induction variable; otherwise *IND_VAR remains
+   untouched.  IND_VAR can be NULL, in which case nothing is stored.  */
+
+static bool
+is_cilk_for_stmt (gimple stmt, tree *ind_var)
+{
+  gimple_seq body = stmt;
+  struct walk_stmt_info wi;
+  struct cilk_for_info cf_info;
+  memset (&cf_info, 0, sizeof (struct cilk_for_info));
+  memset (&wi, 0, sizeof (wi));
+  wi.info = &cf_info;
+  walk_gimple_seq (body, find_cilk_for_stmt, NULL, &wi);
+  if (cf_info.found)
+    {
+      if (ind_var)
+	*ind_var = cf_info.induction_var;
+      return true;
+    }
+    
+  return false;
+}
+
+/* Returns the type of the induction variable based on TYPE.  */
+
+static tree
+cilk_for_check_loop_diff_type (tree type)
+{
+  if (type == integer_type_node)
+    return type;
+  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
+    { 
+      if (TYPE_UNSIGNED (type)) 
+	return uint32_type_node;
+      else
+	return integer_type_node;
+    }
+  else
+    {
+      if (TYPE_UNSIGNED (type)) 
+	return uint64_type_node;
+      else
+	return long_long_integer_type_node;
+    }
+}
+
 /* Build a decl for the omp child function.  It'll not contain a body
    yet, just the bare decl.  */
 
 static void
-create_omp_child_function (omp_context *ctx, bool task_copy)
+create_omp_child_function (omp_context *ctx, bool task_copy,
+			   bool is_cilk_for, tree cilk_var_type)
 {
   tree decl, type, name, t;
-
-  name = create_omp_child_function_name (task_copy);
+ 
+  name = create_omp_child_function_name (task_copy, is_cilk_for);
   if (task_copy)
     type = build_function_type_list (void_type_node, ptr_type_node,
 				     ptr_type_node, NULL_TREE);
+  else if (is_cilk_for)
+    type = build_function_type_list (void_type_node, ptr_type_node,
+				     cilk_var_type, cilk_var_type, NULL_TREE);
   else
     type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
 
@@ -1888,6 +2011,34 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_CONTEXT (t) = decl;
   DECL_RESULT (decl) = t;
 
+  /* _Cilk_for's child function requires two extra parameters called
+     __low and __high that are set by the Cilk runtime when it calls this
+     function.  Please refer to the note in expand_cilk_for_body for an
+     explanation of the __high and __low parameters.  */
+  if (is_cilk_for)
+    {
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__high"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__low"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+    }
   t = build_decl (DECL_SOURCE_LOCATION (decl),
 		  PARM_DECL, get_identifier (".omp_data_i"), ptr_type_node);
   DECL_ARTIFICIAL (t) = 1;
@@ -1895,6 +2046,8 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_ARG_TYPE (t) = ptr_type_node;
   DECL_CONTEXT (t) = current_function_decl;
   TREE_USED (t) = 1;
+  if (is_cilk_for)
+    DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
   DECL_ARGUMENTS (decl) = t;
   if (!task_copy)
     ctx->receiver_decl = t;
@@ -2016,7 +2169,15 @@ scan_omp_parallel (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+
+  tree ind_var = NULL_TREE;
+  bool is_cilk_for = (flag_enable_cilkplus && outer_ctx
+		      && is_cilk_for_stmt (outer_ctx->stmt, &ind_var));
+  tree cilk_var_type =
+    (is_cilk_for ? cilk_for_check_loop_diff_type (TREE_TYPE (ind_var))
+     : NULL_TREE);
+
+  create_omp_child_function (ctx, false, is_cilk_for, cilk_var_type);
   gimple_omp_parallel_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_parallel_clauses (stmt), ctx);
@@ -2061,7 +2222,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+  create_omp_child_function (ctx, false, false, NULL_TREE);
   gimple_omp_task_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_task_clauses (stmt), ctx);
@@ -2074,7 +2235,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
       DECL_ARTIFICIAL (name) = 1;
       DECL_NAMELESS (name) = 1;
       TYPE_NAME (ctx->srecord_type) = name;
-      create_omp_child_function (ctx, true);
+      create_omp_child_function (ctx, true, false, NULL_TREE);
     }
 
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
@@ -2199,7 +2360,7 @@ scan_omp_target (gimple stmt, omp_context *outer_ctx)
   TYPE_NAME (ctx->record_type) = name;
   if (kind == GF_OMP_TARGET_KIND_REGION)
     {
-      create_omp_child_function (ctx, false);
+      create_omp_child_function (ctx, false, false, NULL_TREE);
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
@@ -2993,6 +3154,15 @@ lower_rec_simd_input_clauses (tree new_var, omp_context *ctx, int &max_vf,
   return true;
 }
 
+/* Returns true if the variable name in DECL matches *NAME.  */
+
+static inline bool
+is_cilk_loop_var (tree decl, const char *name)
+{
+  return (DECL_NAME (decl) && !strncmp (IDENTIFIER_POINTER (DECL_NAME (decl)), 
+					name, strlen (name))); 
+}
+
 /* Generate code to implement the input clauses, FIRSTPRIVATE and COPYIN,
    from the receiver (aka child) side and initializers for REFERENCE_TYPE
    private variables.  Initialization statements go in ILIST, while calls
@@ -3245,6 +3415,18 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 	      SET_DECL_VALUE_EXPR (new_var, x);
 	      DECL_HAS_VALUE_EXPR_P (new_var) = 1;
 
+	      /* In _Cilk_for, the increment, start and final values
+		 are stored in the clauses inserted by gimplify_omp_for.
+		 These values are used by the child function to compute
+		 the appropriate induction value based on the high and
+		 low parameters of the child function.  Store the decl
+		 value expressions here so that they can be accessed
+		 easily later.  */
+	      if (flag_enable_cilkplus 
+		  && (is_cilk_loop_var (var, "__cilk_init") 
+		      || is_cilk_loop_var (var, "__cilk_cond")
+		      || is_cilk_loop_var (var, "__cilk_incr"))) 
+		SET_DECL_VALUE_EXPR (var, x);
 	      /* ??? If VAR is not passed by reference, and the variable
 		 hasn't been initialized yet, then we'll get a warning for
 		 the store into the omp_data_s structure.  Ideally, we'd be
@@ -4628,6 +4810,252 @@ expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from)
     }
 }
 
+/* Returns true if T is a tree whose code is COMPONENT_REF and its field
+   matches D_F_NAME and the data argument matches D_ARG_NAME.  */
+
+static bool
+cilk_find_field_value (tree t, tree d_arg_name, tree d_f_name)
+{
+  if (TREE_CODE (t) == COMPONENT_REF)
+    {
+      tree arg = TREE_OPERAND (t, 0);
+      tree field = TREE_OPERAND (t, 1);
+      if (TREE_CODE (arg) == ADDR_EXPR || TREE_CODE (arg) == MEM_REF)
+	arg = TREE_OPERAND (arg, 0);
+      if (DECL_NAME (arg) && DECL_NAME (field)
+	  && !strcmp (IDENTIFIER_POINTER (d_arg_name),
+		      IDENTIFIER_POINTER (DECL_NAME (arg)))
+	  && !strcmp (IDENTIFIER_POINTER (d_f_name),
+		      IDENTIFIER_POINTER (DECL_NAME (field)))) 
+	return true;
+    }
+  return false;
+}
+
+/* Find the COMPONENT_REF in all the basic blocks in REGION whose 
+   data-argument is DATA_ARG and field is FIELD and then replace that 
+   COMPONENT_REF value with NEW_VALUE, a VAR_DECL.  */
+
+static void
+cilk_for_find_component_expr (struct omp_region *region, tree data_arg,
+			      tree field, tree new_value)
+{
+  vec<basic_block> bbs;
+  basic_block bb;
+  unsigned ii;
+  tree new_val = NULL_TREE;
+  bbs.create (0);
+  gather_blocks_in_sese_region (region->entry, region->exit, &bbs);
+  /* No need to push the entry bb into BBS since it doesn't get inserted
+     into the child function.  */
+  
+  tree da_name = DECL_NAME (data_arg);
+  tree df_name = DECL_NAME (field);
+  FOR_EACH_VEC_ELT (bbs, ii, bb)    
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+	 gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+	if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	  for (unsigned jj = 1; jj < gimple_num_ops (stmt); jj++)
+	    {
+	      tree *op = gimple_op_ptr (stmt, jj);
+	      if (TREE_CODE (*op) == COMPONENT_REF
+		  && cilk_find_field_value (*op, da_name, df_name))
+		{    
+		  if (TREE_TYPE (*op) == TREE_TYPE (new_value))
+		    new_val = new_value;
+		  else
+		    {
+		      tree t = fold_convert (TREE_TYPE (*op), new_value);
+		      new_val =
+			force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+						  true, GSI_NEW_STMT);
+		    }
+		  gsi_insert_before (&gsi, gimple_build_assign (*op, new_val), 
+				     GSI_NEW_STMT);
+		  *op = new_val;
+		}
+	    }
+      }
+}
+
+/* Find the initial, final and increment values in BODY_STMT's clause
+   and store them in *INIT, *FINAL and *INCR parameters respectively.  */
+
+static void
+find_cilk_for_vars (gimple body_stmt, tree *init, tree *final, tree *incr)
+{
+  /* The names of the initial, final and increment values start with
+     __cilk_init, __cilk_cond and __cilk_incr, respectively.  These values
+     are listed in the shared clauses, so we search those.  */
+  for (tree cc = gimple_omp_parallel_clauses (body_stmt); cc; 
+       cc = OMP_CLAUSE_CHAIN (cc))
+    if (OMP_CLAUSE_CODE (cc) == OMP_CLAUSE_SHARED)
+      {
+	tree decl = OMP_CLAUSE_DECL (cc);
+	if (is_cilk_loop_var (decl, "__cilk_incr"))
+	  { 
+	    *incr = decl;
+	    if (DECL_VALUE_EXPR (*incr))
+	      *incr = DECL_VALUE_EXPR (*incr);
+	  } 
+	else if (is_cilk_loop_var (decl, "__cilk_init"))
+	  { 
+	    *init = decl;
+	    if (DECL_VALUE_EXPR (*init))
+	      *init = DECL_VALUE_EXPR (*init);
+	  }
+	else if (is_cilk_loop_var (decl, "__cilk_cond"))
+	  { 
+	    *final = decl;
+	    if (DECL_VALUE_EXPR (*final))
+	      *final = DECL_VALUE_EXPR (*final);
+	  }
+      }
+}
+ 
+/* Expand the _Cilk_for body starting at REGION.  DATA_ARG, LOW and HIGH
+   are the data-argument, __low and __high parameters of the child
+   function.
+   Note: __low and __high are parameters passed in by the Cilk Runtime to
+   indicate the start and end of the section.  */
+
+static void
+expand_cilk_for_body (struct omp_region *region, tree data_arg,
+		      tree low, tree high)
+{
+  struct omp_for_data fd;
+  struct omp_for_data_loop *loops;
+  loops
+    = (struct omp_for_data_loop *)
+      alloca (gimple_omp_for_collapse (last_stmt (region->outer->entry))
+	      * sizeof (struct omp_for_data_loop));
+  extract_omp_for_data (last_stmt (region->outer->entry), &fd, loops);
+  region->sched_kind = fd.sched_kind;
+  basic_block entry_bb = region->entry;
+  
+  /* This is where the body is and the location where we must insert
+     the modification to the induction variable.  */
+  basic_block body_bb = single_succ (region->entry);
+  gimple entry_stmt = last_stmt (region->entry);
+  
+  /* Split the first basic block into two and put the initializer values
+     in the top one.  */
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  basic_block l1_bb = split_block (entry_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (l1_bb);
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd.loop.v));
+  tree ind_var = create_tmp_reg (type, "__cilk_ind_var");
+  tree t = fold_convert (type, low);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_NEW_STMT);
+  gimple stmt = gimple_build_assign (ind_var, fold_convert (type, t));
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  vec_alloc (region->ws_args, 2);
+  tree t1 = null_pointer_node;
+  tree t2 = build_fold_addr_expr (gimple_omp_parallel_child_fn (entry_stmt));
+  if (data_arg)
+    {
+      t1 = build_fold_addr_expr (gimple_omp_parallel_data_arg (entry_stmt));
+      gsi = gsi_start_bb (body_bb);
+      tree init = NULL_TREE, final_val = NULL_TREE, incr = NULL_TREE;
+      find_cilk_for_vars (entry_stmt, &init, &final_val, &incr);
+
+      tree step = fd.loop.step;
+      if (TREE_CODE (fd.loop.step) != INTEGER_CST)
+	step = incr;      
+      step = fold_convert (type, step);
+      if (TREE_CODE (step) == INTEGER_CST && tree_int_cst_sgn (step) < 1)
+	step = fold_build1_loc (UNKNOWN_LOCATION, NEGATE_EXPR, type, step);
+      
+      tree tmp = create_tmp_reg (type, NULL);
+      gsi_insert_before (&gsi, gimple_build_assign (tmp, step),
+			 GSI_NEW_STMT);
+      t = build2 (MULT_EXPR, type, ind_var, tmp);
+      tree tmp2 = create_tmp_reg (type, NULL);
+      gsi_insert_after (&gsi, gimple_build_assign (tmp2, t), GSI_NEW_STMT);
+
+      tmp = create_tmp_reg (type, NULL);
+      init = fold_convert (type, init);
+      tree init_tmp = force_gimple_operand_gsi
+	(&gsi, init, true, NULL_TREE, false, GSI_CONTINUE_LINKING); 
+
+      gsi_insert_after (&gsi, gimple_build_assign (tmp, init_tmp), 
+			GSI_NEW_STMT);
+      if (fd.loop.cond_code == GE_EXPR || fd.loop.cond_code == GT_EXPR) 
+	t = fold_build2 (MINUS_EXPR, type, tmp, tmp2);
+      else 
+	t = fold_build2 (PLUS_EXPR, type, tmp, tmp2);
+
+      t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false, 
+				    GSI_CONTINUE_LINKING);
+      tree tmp3 = create_tmp_reg (type, NULL);
+      gimple stmt = gimple_build_assign (tmp3, t);
+      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+      cilk_for_find_component_expr (region, data_arg, fd.loop.v, tmp3);
+    }
+  region->ws_args->quick_push (t1);
+  region->ws_args->quick_push (t2);
+  
+  gsi = gsi_last_bb (l1_bb);
+  basic_block cond_bb = split_block (l1_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (l1_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (cond_bb);
+  t = fold_convert (type, high);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_CONTINUE_LINKING);
+  t = build2 (LT_EXPR, boolean_type_node, ind_var, t);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p,
+		 NULL, NULL)
+      || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p,
+		    NULL, NULL))
+    {
+      gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+
+  /* Insert incrementing of induction variable.  */
+  gsi = gsi_last_bb (body_bb);
+  t = build2 (PLUS_EXPR, type, ind_var, build_one_cst (type));
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+  gsi_insert_after (&gsi, gimple_build_assign (ind_var, t),
+		    GSI_CONTINUE_LINKING);
+  
+  basic_block exit_bb = region->exit;
+
+  gsi = gsi_last_bb (exit_bb);
+  basic_block last_bb = split_block (exit_bb, gsi_stmt (gsi))->dest;
+  
+  /* Remove the #pragma omp return.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+  
+  gsi = gsi_last_bb (last_bb);
+  gsi_insert_before (&gsi, gimple_build_return (NULL), GSI_SAME_STMT);
+  
+  /* Now connect all the basic-blocks.  */
+  edge e = make_edge (cond_bb, last_bb, EDGE_FALSE_VALUE);
+  e->probability = REG_BR_PROB_BASE / 4;
+
+  edge e3 = find_edge (cond_bb, body_bb);
+  e3->probability = REG_BR_PROB_BASE * 3 / 4;
+  e3->flags = EDGE_TRUE_VALUE;
+  
+  edge e2 = find_edge (exit_bb, last_bb);
+  remove_edge (e2);
+  e2 = make_edge (exit_bb, cond_bb, EDGE_FALLTHRU);
+  e2->probability = REG_BR_PROB_BASE;
+  region->exit = last_bb;
+}
+
 /* Expand the OpenMP parallel or task directive starting at REGION.  */
 
 static void
@@ -4640,6 +5068,7 @@ expand_omp_taskreg (struct omp_region *region)
   gimple entry_stmt, stmt;
   edge e;
   vec<tree, va_gc> *ws_args;
+  gimple parcopy_stmt = NULL;
 
   entry_stmt = last_stmt (region->entry);
   child_fn = gimple_omp_taskreg_child_fn (entry_stmt);
@@ -4648,6 +5077,16 @@ expand_omp_taskreg (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  /* A _Cilk_for statement is represented in the compiler as a
+     #pragma omp for whose body is enclosed in a #pragma omp parallel.
+     In this routine, we handle inserting the body into the child
+     function and putting a loop around it that goes from low to high.
+     NOTE: Even though the compiler represents them this way, they do
+     NOT behave like their OpenMP counterparts.  */
+  bool is_cilk_for =
+    (flag_enable_cilkplus && region->outer
+     && is_cilk_for_stmt (last_stmt (region->outer->entry), NULL));
+    
   if (is_combined_parallel (region))
     ws_args = region->ws_args;
   else
@@ -4698,7 +5137,6 @@ expand_omp_taskreg (struct omp_region *region)
 	  basic_block entry_succ_bb = single_succ (entry_bb);
 	  gimple_stmt_iterator gsi;
 	  tree arg, narg;
-	  gimple parcopy_stmt = NULL;
 
 	  for (gsi = gsi_start_bb (entry_succ_bb); ; gsi_next (&gsi))
 	    {
@@ -4755,6 +5193,29 @@ expand_omp_taskreg (struct omp_region *region)
 	    }
 	}
 
+      /* Extract the __high and __low parameters from the child function.  */
+      tree high_arg = NULL_TREE, low_arg = NULL_TREE;
+      if (is_cilk_for)
+	{
+	  for (tree ii_arg = DECL_ARGUMENTS (child_fn); ii_arg != NULL_TREE;
+	       ii_arg = TREE_CHAIN (ii_arg))
+	    {
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)),
+			   "__high"))
+		high_arg = ii_arg;
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__low"))
+		low_arg = ii_arg;
+	    }
+	  gcc_assert (high_arg);
+	  gcc_assert (low_arg);
+	  expand_cilk_for_body (region, gimple_get_lhs (parcopy_stmt),
+				low_arg, high_arg);
+
+	  /* A new BB is added to the end of EXIT_BB and thus it needs to be
+	     updated.  */
+	  exit_bb = region->exit;
+	}
+
       /* Declare local variables needed in CHILD_CFUN.  */
       block = DECL_INITIAL (child_fn);
       BLOCK_VARS (block) = vec2chain (child_cfun->local_decls);
@@ -4787,7 +5248,7 @@ expand_omp_taskreg (struct omp_region *region)
       single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
 
       /* Convert GIMPLE_OMP_RETURN into a RETURN_EXPR.  */
-      if (exit_bb)
+      if (exit_bb && !is_cilk_for)
 	{
 	  gsi = gsi_last_bb (exit_bb);
 	  gcc_assert (!gsi_end_p (gsi)
@@ -4861,11 +5322,16 @@ expand_omp_taskreg (struct omp_region *region)
       pop_cfun ();
     }
 
-  /* Emit a library call to launch the children threads.  */
-  if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
-    expand_parallel_call (region, new_bb, entry_stmt, ws_args);
-  else
-    expand_task_call (new_bb, entry_stmt);
+  /* In _Cilk_for, the call to the runtime function is inserted by
+     expand_omp_for.  */
+  if (!is_cilk_for)
+    {
+      /* Emit a library call to launch the children threads.  */
+      if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
+	expand_parallel_call (region, new_bb, entry_stmt, ws_args);
+      else
+	expand_task_call (new_bb, entry_stmt);
+    }
   if (gimple_in_ssa_p (cfun))
     update_ssa (TODO_update_ssa_only_virtuals);
 }
@@ -6540,6 +7006,127 @@ expand_omp_for_static_chunk (struct omp_region *region,
     }
 }
 
+/* A subroutine of expand_omp_for.  Insert a call to the Cilk runtime
+   library function __cilkrts_cilk_for_64/32 at the end of REGION.  The
+   loop count is calculated using step, n1 and n2 from FD.  */
+
+static void
+insert_cilk_for_fn_call (struct omp_region *region, struct omp_for_data *fd)
+{
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  basic_block entry_bb = region->entry;
+  bool broken_loop = region->cont == NULL;
+  basic_block cont_bb = region->cont;
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  tree diff_type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+
+  tree clauses = gimple_omp_for_clauses (fd->for_stmt);
+
+  tree grain = find_omp_clause (clauses, OMP_CLAUSE_SCHEDULE);
+  gcc_assert (grain != NULL_TREE);
+  grain = OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (grain);
+  
+  /* Convert n2 and n1 to the type we need.  */
+  tree n1 = fold_convert (diff_type, fd->loop.n1);
+  tree n2 = fold_convert (diff_type, fd->loop.n2);
+
+  n1 = force_gimple_operand_gsi (&gsi, n1, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  n2 = force_gimple_operand_gsi (&gsi, n2, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  tree diff_val = fold_build2 (MINUS_EXPR, diff_type, n2, n1);
+
+  diff_val = force_gimple_operand_gsi (&gsi, diff_val, true, NULL_TREE,
+					    true, GSI_SAME_STMT);
+  tree step = fd->loop.step;
+  tree step_convert = force_gimple_operand_gsi (&gsi,
+						fold_convert (diff_type, step),
+						true, NULL_TREE, true,
+						GSI_SAME_STMT);
+  tree count = fold_build2 (TRUNC_DIV_EXPR, diff_type, diff_val, step_convert);
+  count = force_gimple_operand_gsi (&gsi, count, true, NULL_TREE, true,
+				    GSI_SAME_STMT);
+
+  tree data_arg_ptr = (*region->ws_args)[0];
+  tree child_fn = (*region->ws_args)[1];
+
+  tree lib_fun = NULL_TREE;
+  if (TYPE_PRECISION (type) == 32)
+    lib_fun = cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (type) == 64)
+    lib_fun = cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+
+  vec<tree, va_gc> *args;
+  vec_alloc (args, 4);
+  args->quick_push (child_fn);
+  args->quick_push (data_arg_ptr);
+  args->quick_push (count);
+  args->quick_push (grain);
+  tree t = build_call_expr_loc_vec (UNKNOWN_LOCATION, lib_fun, args);
+  gsi_remove (&gsi, true);
+
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      gimple stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+      gsi_remove (&gsi, true);
+      
+      /* Remove the edges to the OMP continue block.  */
+      unsigned int ii = 0;
+      while (EDGE_COUNT (cont_bb->succs) > 1)
+	{
+	  edge ee = EDGE_SUCC (cont_bb, ii);
+	  if (!(ee->flags & EDGE_FALLTHRU))
+	    remove_edge (ee);
+	  ii++;
+	}      
+      gsi = gsi_start_bb (cont_bb);
+      gsi_remove (&gsi, true);
+      force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+    }
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (region->exit);
+  gimple stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_OMP_RETURN);
+  gsi_remove (&gsi, true);
+
+  gsi = gsi_last_bb (region->entry);
+  t = fold_build2 (fd->loop.cond_code, boolean_type_node, n1, n2);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  /* In here we are replacing a _Cilk_for statement with something
+     like this:
+
+     if (n1 <cond_code> n2)
+       goto bb1
+     else
+       goto bb2
+     
+     bb1:
+       .omp_data.o.__cilk_incr = __cilk_incr;
+       ...
+       __cilkrts_cilk_for_{32/64} (func_name, &omp_data_0, <count>, <grain>);
+
+     bb2:
+     clobber all values and go out.  */  
+  unsigned int ii = 0;
+  while (ii < EDGE_COUNT (region->entry->succs))
+    {
+      edge ee = EDGE_SUCC (region->entry, ii);
+      if (ee->flags & EDGE_FALLTHRU)
+	ee->flags = EDGE_TRUE_VALUE;
+      else
+	ee->flags = EDGE_FALSE_VALUE;
+      ii++;
+    }
+}
 
 /* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
    loop.  Given parameters:
@@ -6880,6 +7467,12 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 
   if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD)
     expand_omp_simd (region, &fd);
+  else if (flag_enable_cilkplus 
+	   && (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_CILKFOR))
+    {
+      region->ws_args = region->inner->ws_args;
+      insert_cilk_for_fn_call (region, &fd);
+    }
   else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
 	   && !fd.have_ordered)
     {
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
new file mode 100644
index 0000000..a80f413
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
@@ -0,0 +1,100 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+static void check (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start;  ii < end; ii = ii + incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+static void check_reverse (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start; ii >= end; ii = ii - incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+
+int main (void)
+{
+  int Array[10];
+  int x = 9, y = 0, z = 3;
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 1133;
+  check (Array, 0, 10, 1, 1133);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 9; ii > -1; ii--)
+    Array[ii] = 4433;
+  check_reverse (Array, 9, 0, 1, 4433);
+
+  _Cilk_for (int ii = 9; ii > -1; --ii)
+    Array[ii] = 9988;
+  check_reverse (Array, 9, 0, 1, 9988);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    Array[ii] = 1328;
+  check (Array, 0, 10, 2, 1328);
+
+  _Cilk_for (int ii = 9; ii >= 0; ii -= 2)
+    Array[ii] = 1738;
+  check_reverse (Array, 9, 0, 2, 1738);
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if (ii % 2)
+	Array[ii] = 1343;
+      else
+	Array[ii] = 3413;
+    }
+
+  check (Array, 1, 10, 2, 1343); 
+  check (Array, 0, 10, 2, 3413); 
+
+  _Cilk_for (short cc = 0; cc < 10; cc++) 
+    Array[cc] = 1343;
+  check (Array, 0, 10,  1,1343);
+
+  _Cilk_for (short cc = 9; cc >= 0; cc--)
+    Array[cc] = 1348;
+  check_reverse (Array, 9, 0, 1, 1348);
+
+
+
+  /* Loop with polynomials in _Cilk_for.  */
+  _Cilk_for (int ii = z - 3; ii <= z * 3; ii += 2)
+    { 
+      Array[ii] = 3233;
+    }
+
+  for (int ii = z-3; ii <= z*3; ii += 2)
+    if (Array[ii] != 3233)
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..ed73c34
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */
+    /* { dg-error "expected" "" { target c++ } 10 } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" "" { target c } } */
+    /* { dg-error "invalid controlling predicate" "" { target c++ }  20 } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected" } */ 
+    q = 5;
+
+  _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
+    q = 5;
+
+ 
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static" } */
+
+    q = 5;
+
+
+  _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
+    q = 5;
+
+
+  _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */ 
+    /* { dg-error "expected" "" { target c++ }  53 } */
+    q = 5;
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..6cb9b03
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,35 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+
+  for (int ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (int ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..e1e3217
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK, it will do a type conversion to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize = 1 
+  while (Array1[5] != 0) /* { dg-warning "is not followed by" } */
+    {
+    /* Blah */
+    }
+
+#pragma cilk grainsize = 1 
+  int q = 0; /* { dg-warning "is not followed by" } */
+  _Cilk_for (q = 0; q < 10; q++)
+    Array1[q]  = 5;
+
+  while (Array1[5] != 0)
+    {
+    /* Blah */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..7a779f7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,41 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int ii = 0;
+
+#if 1
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+#endif
+  _Cilk_for(int *aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      __builtin_abort ();
+#endif
+
+  _Cilk_for (int *aa = a; aa < a + ARRAY_SIZE; aa = aa + 2)
+    *aa = 4;
+
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii = ii + 2) 
+    if (a[ii] != 4) 
+      __builtin_abort ();
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
new file mode 100644
index 0000000..cffe17e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
@@ -0,0 +1,79 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+int main (void)
+{
+  int Array[10][10];
+
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj++)
+	{
+	  Array[ii][jj] = 0;
+	}
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 5; jj++)
+      Array[ii][jj] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 5; jj++)
+      if (Array[ii][jj] != 5)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+
+  /* One goes up and one goes down.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 9; jj >= 0; jj--)
+      Array[ii][jj] = 7;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 9; jj >= 0; jj--)
+      if (Array[ii][jj] != 7)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* Different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 10; jj += 2)
+      Array[ii][jj] = 9;
+  
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj += 2)
+      if (Array[ii][jj] != 9)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* Different step sizes, with the step on the outer loop.  */
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    _Cilk_for (int jj = 5; jj < 9; jj++)
+      Array[ii][jj] = 11; 
+  
+  for (int ii = 0; ii < 10; ii += 2)
+    for (int jj = 5; jj < 9; jj++)
+      if (Array[ii][jj] != 11)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index e548a0d..d8c14e3 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -351,6 +351,7 @@ enum omp_clause_schedule_kind {
   OMP_CLAUSE_SCHEDULE_GUIDED,
   OMP_CLAUSE_SCHEDULE_AUTO,
   OMP_CLAUSE_SCHEDULE_RUNTIME,
+  OMP_CLAUSE_SCHEDULE_CILKFOR,
   OMP_CLAUSE_SCHEDULE_LAST
 };
 
diff --git a/gcc/tree.def b/gcc/tree.def
index f8d6444..558d7c8 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1051,6 +1051,10 @@ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
    Operands like for OMP_FOR.  */
 DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6)
 
+/* Cilk Plus - _Cilk_for (..)
+   Operands like for OMP_FOR.  */
+DEFTREECODE (CILK_FOR, "cilk_for", tcc_statement, 6)
+
 /* OpenMP - #pragma omp distribute [clause1 ... clauseN]
    Operands like for OMP_FOR.  */
 DEFTREECODE (OMP_DISTRIBUTE, "omp_distribute", tcc_statement, 6)
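
For readers following the middle-end part of the patch, here is a minimal
sketch in plain C of the lowering that expand_cilk_for_body and
insert_cilk_for_fn_call appear to implement: the loop body is outlined into
a child function that walks a normalized range [low, high) and rebuilds the
user's loop variable as init +/- ind_var * step. Everything named below
(struct loop_data, child_fn, fake_cilk_for, demo_reverse_loop) is
hypothetical and only for illustration; fake_cilk_for is a serial stand-in
and not the real __cilkrts_cilk_for_32, which may split the range
differently and run chunks in parallel.

```c
/* Hypothetical stand-in for the data record shared with the child
   function; the field names are assumptions, not the patch's.  */
struct loop_data
{
  int *array;
  int init;	/* First value of the user's loop variable.  */
  int step;	/* Magnitude of the step (always positive here).  */
  int down;	/* Nonzero when the loop condition is > or >=.  */
};

/* Shape of the outlined child function: run the normalized iterations
   [low, high) and rebuild the user's variable for each one.  */
static void
child_fn (void *data, unsigned low, unsigned high)
{
  struct loop_data *d = (struct loop_data *) data;
  for (unsigned ind_var = low; ind_var < high; ind_var++)
    {
      int offset = (int) ind_var * d->step;
      int ii = d->down ? d->init - offset : d->init + offset;
      d->array[ii] = 1738;
    }
}

/* Serial stand-in for the runtime entry point (an assumption): hand out
   chunks of at most GRAIN normalized iterations to FN.  */
static void
fake_cilk_for (void (*fn) (void *, unsigned, unsigned), void *data,
	       unsigned count, int grain)
{
  unsigned g = grain < 1 ? 1u : (unsigned) grain;
  for (unsigned low = 0; low < count; low += g)
    {
      unsigned high = count - low < g ? count : low + g;
      fn (data, low, high);
    }
}

/* Models _Cilk_for (int ii = 9; ii >= 0; ii -= 2) Array[ii] = 1738;
   which visits 9, 7, 5, 3 and 1, i.e. five iterations.  */
static void
demo_reverse_loop (int *array)
{
  struct loop_data d = { array, 9, 2, 1 };
  fake_cilk_for (child_fn, &d, 5, 2);
}
```

Calling demo_reverse_loop on a zeroed int[10] should leave 1738 in the odd
slots and 0 in the even ones, matching the check_reverse call for the same
loop in cilk-fors.c. The trip count of 5 is written by hand here; in the
patch it is derived from n1, n2 and step in insert_cilk_for_fn_call.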

[-- Attachment #5: diff_c++.txt --]
[-- Type: text/plain, Size: 17324 bytes --]

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c3016bc..ce371f5 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -237,8 +237,8 @@ static void cp_parser_initial_pragma
 static tree cp_literal_operator_id
   (const char *);
 
-static void cp_parser_cilk_simd
-  (cp_parser *, cp_token *);
+static tree cp_parser_cilk_simd
+  (cp_parser *, cp_token *, tree);
 static bool cp_parser_omp_declare_reduction_exprs
   (tree, cp_parser *);
 
@@ -9375,6 +9375,18 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
 
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (cp_lexer_peek_token (parser->lexer)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      cp_lexer_consume_token (parser->lexer);
+	      statement = error_mark_node;
+	    }
+	  else
+	    statement = cp_parser_cilk_simd (parser, NULL, integer_zero_node);
+	  break;
+
 	case RID_BREAK:
 	case RID_CONTINUE:
 	case RID_RETURN:
@@ -28714,7 +28726,7 @@ cp_parser_omp_for_cond (cp_parser *parser, tree decl, enum tree_code code)
     case LE_EXPR:
       break;
     case NE_EXPR:
-      if (code == CILK_SIMD)
+      if (code == CILK_SIMD || code == CILK_FOR)
 	break;
       /* Fall through: OpenMP disallows NE_EXPR.  */
     default:
@@ -29039,11 +29051,18 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       bool add_private_clause = false;
       location_t loc;
 
-      if (!cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
+      if (code == CILK_SIMD
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
 	{
 	  cp_parser_error (parser, "for statement expected");
 	  return NULL;
 	}
+      if (code == CILK_FOR
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_CILK_FOR))
+	{
+	  cp_parser_error (parser, "_Cilk_for statement expected");
+	  return NULL;
+	}
       loc = cp_lexer_consume_token (parser->lexer)->location;
 
       if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
@@ -29052,13 +29071,26 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       init = decl = real_decl = NULL;
       this_pre_body = push_stmt_list ();
 
+      if (code == CILK_FOR
+	  && cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC))
+	{
+	  error_at (cp_lexer_peek_token (parser->lexer)->location,
+		    "induction variable cannot be static");
+	  cp_lexer_consume_token (parser->lexer);
+	}
       add_private_clause
 	|= cp_parser_omp_for_loop_init (parser,
-					/*parsing_openmp=*/code != CILK_SIMD,
+					/*parsing_openmp=*/
+					(code != CILK_SIMD && code != CILK_FOR),
 					this_pre_body, for_block,
 					init, decl, real_decl);
 
-      cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
+      if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON)
+	  && code == CILK_FOR)
+	{
+	  cp_parser_skip_to_end_of_statement (parser);
+	  cp_parser_consume_semicolon_at_end_of_statement (parser);
+	}
       if (this_pre_body)
 	{
 	  this_pre_body = pop_stmt_list (this_pre_body);
@@ -29216,17 +29248,30 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
 
   /* Note that we saved the original contents of this flag when we entered
      the structured block, and so we don't need to re-save it here.  */
-  if (code == CILK_SIMD)
+  if (code == CILK_SIMD || code == CILK_FOR)
     parser->in_statement = IN_CILK_SIMD_FOR;
   else
     parser->in_statement = IN_OMP_FOR;
 
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = begin_omp_parallel ();
+    }
+
   /* Note that the grammar doesn't call for a structured block here,
      though the loop as a whole is a structured block.  */
   body = push_stmt_list ();
   cp_parser_statement (parser, NULL_TREE, false, NULL);
   body = pop_stmt_list (body);
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = finish_omp_parallel (NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
   if (declv == NULL_TREE)
     ret = NULL_TREE;
   else
@@ -31104,6 +31149,38 @@ cp_parser_initial_pragma (cp_token *first_token)
   cp_lexer_get_preprocessor_token (NULL, first_token);
 }
 
+/* Parses the grainsize pragma for the _Cilk_for statement.
+   Syntax:
+   #pragma cilk grainsize = <VALUE>.  */
+
+static void
+cp_parser_cilk_grainsize (cp_parser *parser, cp_token *pragma_tok)
+{
+  if (cp_parser_require (parser, CPP_EQ, RT_EQ))
+    {
+      tree exp = cp_parser_binary_expression (parser, false, false,
+                                              PREC_NOT_OPERATOR, NULL);
+      cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+      if (!exp || exp == error_mark_node)
+        {
+          error_at (pragma_tok->location, "invalid grainsize for _Cilk_for");
+          return;
+        }
+      cp_token *n_tok = cp_lexer_peek_token (parser->lexer);
+
+      /* Make sure the next token is _Cilk_for; anything else is invalid.  */
+      if (n_tok && n_tok->type == CPP_KEYWORD
+	  && n_tok->keyword == RID_CILK_FOR)
+	cp_parser_cilk_simd (parser, NULL, exp);
+      else
+	warning_at (cp_lexer_peek_token (parser->lexer)->location, 0,
+		    "%<#pragma cilk grainsize%> is not followed by "
+		    "%<_Cilk_for%>");
+      return;
+    }
+  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+}
+
 /* Normal parsing of a pragma token.  Here we can (and must) use the
    regular lexer.  */
 
@@ -31283,9 +31360,30 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
 		    "%<#pragma simd%> must be inside a function");
 	  break;
 	}
-      cp_parser_cilk_simd (parser, pragma_tok);
+      cp_parser_cilk_simd (parser, pragma_tok, NULL_TREE);
       return true;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (context == pragma_external)
+        {
+          error_at (pragma_tok->location,
+                    "%<#pragma cilk grainsize%> must be inside a function");
+          break;
+        }
+
+      /* Ignore the pragma if Cilk Plus is not enabled.  */
+      if (flag_enable_cilkplus)
+        {
+          cp_parser_cilk_grainsize (parser, pragma_tok);
+          return true;
+        }
+      else
+        {
+          error_at (pragma_tok->location, "-fcilkplus must be enabled to use "
+                    "%<#pragma cilk grainsize%>");
+          break;
+	}
+      
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -31575,31 +31673,64 @@ cp_parser_cilk_simd_all_clauses (cp_parser *parser, cp_token *pragma_token)
     return c_finish_cilk_clauses (clauses);
 }
 
-/* Main entry-point for parsing Cilk Plus <#pragma simd> for loops.  */
+/* Main entry-point for parsing Cilk Plus <#pragma simd> for and _Cilk_for
+   loops.  This function returns NULL_TREE whenever it is parsing the
+   #pragma simd's for because the caller does not check the return value.
+   _Cilk_for's caller checks this value and thus return error_mark_node
+   when errors happen and a valid value when things go as expected.  */
 
-static void
-cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token)
+static tree
+cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token, tree grain)
 {
-  tree clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
+  bool is_cilk_for = !pragma_token ? true: false;
+  tree clauses = NULL_TREE;
+
+  if (!is_cilk_for)
+    clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
 
   if (clauses == error_mark_node)
-    return;
-  
-  if (cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
+    return NULL_TREE;
+
+  if (!is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
     {
       error_at (cp_lexer_peek_token (parser->lexer)->location,
 		"for statement expected");
-      return;
+      return NULL_TREE;
+    }
+  if (is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_CILK_FOR))
+    {
+      error_at (cp_lexer_peek_token (parser->lexer)->location,
+		"_Cilk_for statement expected");
+      return error_mark_node;
     }
 
   tree sb = begin_omp_structured_block ();
   int save = cp_parser_begin_omp_structured_block (parser);
-  tree ret = cp_parser_omp_for_loop (parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  tree ret = cp_parser_omp_for_loop (parser, code, clauses, NULL);
   if (ret)
     cpp_validate_cilk_plus_loop (OMP_FOR_BODY (ret));
+
+  /* For _Cilk_for statements, the grain value is stored in the same
+     location as clauses for OMP for inside a schedule clause.  */
+  if (is_cilk_for && ret)
+    { 
+      tree l = build_omp_clause (EXPR_LOCATION (grain),
+				 OMP_CLAUSE_SCHEDULE);
+      OMP_CLAUSE_SCHEDULE_KIND (l) = OMP_CLAUSE_SCHEDULE_RUNTIME;
+      OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (l) = grain;
+      OMP_CLAUSE_CHAIN (l) = NULL_TREE;
+      OMP_FOR_CLAUSES (ret) = l;
+    }
+
   cp_parser_end_omp_structured_block (parser, save);
-  add_stmt (finish_omp_structured_block (sb));
-  return;
+  tree stmt = finish_omp_structured_block (sb);
+  add_stmt (stmt);
+  if (is_cilk_for) 
+    return stmt;
+  return NULL_TREE;
 }
 
 /* Create an identifier for a generic parameter type (a synthesized
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2e7cf60..eae8e93 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13586,6 +13586,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
     case OMP_FOR:
     case OMP_SIMD:
     case CILK_SIMD:
+    case CILK_FOR:
     case OMP_DISTRIBUTE:
       {
 	tree clauses, body, pre_body;
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index eb04266..897fee5 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5965,7 +5965,8 @@ finish_omp_task (tree clauses, tree body)
 static bool
 handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
 			       tree condv, tree incrv, tree *body,
-			       tree *pre_body, tree clauses)
+			       tree *pre_body, tree clauses,
+			       bool is_cilk_for)
 {
   tree diff, iter_init, iter_incr = NULL, last;
   tree incr_var = NULL, orig_pre_body, orig_body, c;
@@ -5985,6 +5986,7 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
     case GE_EXPR:
     case LT_EXPR:
     case LE_EXPR:
+    case NE_EXPR:
       if (TREE_OPERAND (cond, 1) == iter)
 	cond = build2 (swap_tree_comparison (TREE_CODE (cond)),
 		       TREE_TYPE (cond), iter, TREE_OPERAND (cond, 0));
@@ -6128,6 +6130,11 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
       break;
 
   decl = create_temporary_var (TREE_TYPE (diff));
+  /* In _Cilk_for we must know the induction variable name since it is
+     read by expand_cilk_for_body in omp-low.c to set the induction
+     variable in the child function correctly.  */
+  if (is_cilk_for)
+    DECL_NAME (decl) = make_anon_name ();
   pushdecl (decl);
   add_decl_expr (decl);
   last = create_temporary_var (TREE_TYPE (diff));
@@ -6343,8 +6350,24 @@ finish_omp_for (location_t locus, enum tree_code code, tree declv, tree initv,
 				"iteration variable %qE", decl);
 	      return NULL;
 	    }
-	  if (handle_omp_for_class_iterator (i, locus, declv, initv, condv,
-					     incrv, &body, &pre_body, clauses))
+
+	  /* In _Cilk_for, all the iterator mapping code should be
+	     inserted in the OMP_PARALLEL_BODY.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree the_body = OMP_PARALLEL_BODY (body);
+	      if (TREE_CODE (the_body) == BIND_EXPR)
+		the_body = BIND_EXPR_BODY (the_body);
+	      if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						 condv, incrv, &the_body,
+						 &pre_body, clauses, true))
+		return NULL;
+	      else
+		BIND_EXPR_BODY (OMP_PARALLEL_BODY (body)) = the_body;
+	    }
+	  else if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						  condv, incrv, &body,
+						  &pre_body, clauses, false))
 	    return NULL;
 	  continue;
 	}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
new file mode 100644
index 0000000..8221371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
@@ -0,0 +1,25 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define SIZE 100
+#define CHECK_VALUE 5
+
+template <class T>
+int func (T start, T end)
+{
+  int Array[SIZE];
+  _Cilk_for (T ii = 0; ii < end; ii++)
+    Array[ii] = CHECK_VALUE;
+  
+  for (T ii = 0; ii < end; ii++)
+    if (Array[ii] != CHECK_VALUE)
+      __builtin_abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  return func <int> (0, 100) + func <long> (0, 100);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
new file mode 100644
index 0000000..2ac8c72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
@@ -0,0 +1,52 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array;
+vector <int> array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back (ii);
+}
+#endif
+_Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end();
+          iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+for (vector<int>::iterator iter = array_serial.begin(); 
+     iter != array_serial.end(); iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
new file mode 100644
index 0000000..1cf3301
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
@@ -0,0 +1,72 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array,array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back(ii);
+}
+#endif
+_Cilk_for (vector<int>::reverse_iterator iter4 = array.rbegin(); 
+	   iter4 != array.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+
+_Cilk_for (vector<int>::reverse_iterator iter4 = array_serial.rbegin(); 
+	   iter4 != array_serial.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+_Cilk_for (vector<int>::reverse_iterator iter2 = array.rbegin(); 
+	   iter2 != array.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+for (vector<int>::reverse_iterator iter2 = array_serial.rbegin(); 
+     iter2 != array_serial.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
+
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
new file mode 100644
index 0000000..3e350a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
@@ -0,0 +1,50 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <iostream>
+#include <cstdio>
+#include <cstdlib>
+#include <vector>
+#include <algorithm>
+#include <list>
+
+using namespace std;
+
+
+int main(int argc, char **argv)
+{
+  vector <int> number_list, number_list_serial;
+  int new_number = 0;
+  int no_elements = 0;
+  
+  if (argc != 2)
+  {
+    no_elements = 10000;
+  }
+
+
+  number_list.clear();
+  number_list_serial.clear();
+  for (int ii = 0; ii < no_elements; ii++)
+  {
+    number_list.push_back(new_number);
+    number_list_serial.push_back(new_number);
+  }
+
+  _Cilk_for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list[jj] = jj + no_elements;
+  }
+  for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list_serial[jj] = jj + no_elements;
+  }
+
+  for (int jj = 0; jj < no_elements; jj++)
+    if (number_list_serial[jj] != number_list[jj])
+      __builtin_abort ();
+
+  return 0;
+}


* RE: [PATCH] _Cilk_for for C and C++
  2014-01-16 21:19                                       ` Aldy Hernandez
  2014-01-17  9:24                                         ` Marek Polacek
@ 2014-01-19  4:53                                         ` Iyer, Balaji V
  1 sibling, 0 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-19  4:53 UTC (permalink / raw)
  To: Aldy Hernandez, Jakub Jelinek
  Cc: Jason Merrill, 'Jeff Law',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

Hi Aldy,
	I have answered your questions below. I am attaching the updated patch to my response to Jason Merrill's email rather than here, to avoid unnecessary duplication. I hope that's OK with you. If you would prefer otherwise, please let me know and I can send it to you.

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Aldy Hernandez [mailto:aldyh@redhat.com]
> Sent: Thursday, January 16, 2014 4:19 PM
> To: Iyer, Balaji V; Jakub Jelinek
> Cc: Jason Merrill; 'Jeff Law'; 'gcc-patches@gcc.gnu.org'; 'rth@redhat.com'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> Here are a few things.
> 
> > +      if (g_expr.value && TREE_CODE (g_expr.value) ==
> C_MAYBE_CONST_EXPR)
> > +	{
> > +	  error_at (input_location, "cannot convert grain to long integer.\n");
> > +	  c_parser_skip_to_pragma_eol (parser);
> > +	}
> 
> Remove final period.  Also, where's the testcase?  Also, there seems to be
> spurious white space after the "}".
> 

Was not necessary. Fixed.

> Is it required that it be a long integer?  Because I see no further checks for
> this.
> 
Yes, I believe the language spec says the grainsize should be convertible to a long int.

> > +	  c_token *token = c_parser_peek_token (parser);
> > +	  if (token && token->type == CPP_KEYWORD
> > +	      && token->keyword == RID_CILK_FOR)
> 
> It doesn't look like c_parser_peek_token() ever returns NULL, so no need to
> check for token != 0.

Fixed.

> 
> > +	      tree grain = convert_to_integer (long_integer_type_node,
> > +					       g_expr.value);
> > +	      if (grain && grain != error_mark_node)
> > +		c_parser_cilk_simd (parser, grain);
> 
> No need to check grain != 0 here either.
> 

Fixed.

> > ==> a.c.003t.original <==
> >
> > ;; Function main (null)
> > ;; enabled by -tree-original
> >
> >
> > {
> >   int i;
> >
> >     int i;
> >   <<< Unknown tree: cilk_for
> >   #pragma omp parallel
> >     {
> >       {
> 
> Found with -fdump-tree-all.  You should handle the cilk_for tree code in the
> pretty printers, and add corresponding test(s).
> 

I have not fixed this yet. I am not sure how to add tests that check the pretty-printers...


> 
> >  static void
> > -c_parser_cilk_simd (c_parser *parser)
> > +c_parser_cilk_simd (c_parser *parser, tree grain)
> 
> No documentation for grain.
> 


Fixed.

> > +  tree clauses = NULL_TREE;
> > +
> > +  if (!is_cilk_for)
> > +    clauses = c_parser_cilk_all_clauses (parser);
> > +  else
> > +    clauses = grain;
> 
> First set of clauses=NULL_TREE is useless.
> 

OK. It is just my habit to initialize every variable where it is defined. I have removed it.

> >  static tree
> >  c_parser_omp_for_loop (location_t loc, c_parser *parser, enum
> tree_code code,
> > -		       tree clauses, tree *cclauses)
> > +		       tree clauses_or_grain, tree *cclauses)
> 
> Don't overload clauses and grainsize into one argument.  Add another
> argument.  Also, document said argument.
> 

OK. I have added a new parameter called grain and renamed clauses_or_grain back to clauses.

> > +/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME.  */
> > +
> > +static tree
> > +cilk_declare_looper (const char *name, tree type, enum built_in_function
> code)
> > +{
> 
> I think you should document that it's creating a suitable built-in, not
> just creating a FUNCTION_DECL.  Also, please document argument `code'.
> And call this function something more meaningful, like
> "cilkrts_declare_builtin" or "cilk_declare_for_builtin", but definitely
> not looper :).
> 

Renamed to declare_cilk_for_builtin.

> > @@ -1192,6 +1199,9 @@ dump_gimple_omp_for (pretty_printer *buffer,
> gimple gs, int spc, int flags)
> >  	    case GE_EXPR:
> >  	      pp_greater_equal (buffer);
> >  	      break;
> > +	    case NE_EXPR:
> > +	      pp_string (buffer, "!=");
> > +	      break;
> 
> Thank you :).  That was probably my oversight on the pragma simd work.
> 

You're welcome.

> > @@ -6603,6 +6614,11 @@ gimplify_omp_for (tree *expr_p, gimple_seq
> *pre_p)
> >      }
> >
> >    for_body = NULL;
> > +  if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
> > +    {
> > +      tree it = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), 0);
> > +      gimplify_and_add (it, &for_pre_body);
> > +    }
> 
> And what Jason said for all the special casing for CILK_FOR in this
> function...
> 

This was removed, as mentioned in my email to Jason.

> > +static inline void
> > +gimple_cilk_for_set_grain (tree grain, gimple gs)
> > +{
> > +  const gimple_statement_omp_for *omp_for_stmt =
> > +    as_a <gimple_statement_omp_for> (gs);
> > +  omp_for_stmt->iter[0].grain = grain;
> > +}
> 
> Can we leave the grainsize in the clause, or does it have to reside in
> an auxiliary data structure?  It seems weird as is, since you're only
> setting grain for the first element.  I think Jason mentioned something
> similar.
> 

Got rid of the grainsize functions and just kept it as a clause in _Cilk_for (as suggested by Jason).

> > +/* A structure with necessary elements from _Cilk_for statement.  This
> > +   struct. node is passed in to WALK_STMT_INFO->INFO.  */
> > +typedef struct cilk_for_information {
> 
> "{" should be in a separate line.
> 

Fixed.

> "for a Cilk_for statement", not "from Cilk_for".  No abbreviation on
> struct.  Either remove the period, or spell it out.  Also s/in to/to/.
> 
> I'm not a C++ expert, but my understanding was that in C++ you don't
> need a typedef to use the following structure by name
> (cilk_for_information).  So you can just declare "struct
> cilk_for_information {...}" and instantiate it with just
> "cilk_for_information some_instance".  If that's the case, get rid of
> typedef.
> 
> > +  if (flag_enable_cilkplus
> 
> BTW, weren't you going to change this to flag_cilkplus or something in
> some past follow-up?
> 

Yup. I was going to do it after I finish getting _Cilk_for into GCC.

> >   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
> >   fd->chunk_size = NULL_TREE;
> > +  if (flag_enable_cilkplus
> > +      && gimple_omp_for_kind (fd->for_stmt) ==
> GF_OMP_FOR_KIND_CILKFOR)
> > +    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
> 
> I believe most of the flag_enable_cilkplus checks in omp-low.c can be
> removed, especially the ones related to syntax.  You shouldn't be
> getting any cilk constructs this late if cilkplus was not enabled.  For
> that matter, we don't check flag_openmp in this file throughout, we only
> check for it at the gates.
> 

Yes, but it is kept as a safety measure. If for some reason someone overwrites a bit or something weird happens (like someone setting the #define wrong or something similar), then the Cilk Plus flag checks add another, albeit small, layer of protection.

> >   bool is_cilk_for = (flag_enable_cilkplus && outer_ctx
> > 		      && is_cilk_for_stmt (outer_ctx->stmt, &ind_var));
> 
> Although here you could probably leave it, since it would avoid
> traversing outer_ctx->stmt.  And speak of which, since you're checking
> flag_enable_cilkplus in the caller, why also check it in
> is_cilk_for_stmt?  Do it all in the callee IMO.
> 

Removed from is_cilk_for_stmt.

> Also, you should probably combine boths initializations of sched_kind
> into the same if:
> 
> if (gimple_omp_for_kind (fd->for_stmt) ==  GF_OMP_FOR_KIND_CILKFOR)
>    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
> else
>    fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
> 

These two cannot be the same: _Cilk_for does not rely on OMP's scheduling.

> > +/* Returns the type of the induction variable for the child function for
> > +   _Cilk_for and the types for _high and _low variables based on TYPE.  */
> 
> high, low, type, what?  I don't get it.  Can you rewrite the
> documentation to make it clearer what this does?
> 

Fixed.

> > +
> > +static tree
> > +cilk_for_check_loop_diff_type (tree type)
> > +{
> > +  if (type == integer_type_node)
> > +    return type;
> > +  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
> > +    {
> > +      if (TYPE_UNSIGNED (type))
> > +	return uint32_type_node;
> > +      else
> > +	return integer_type_node;
> > +    }
> > +  else
> > +    {
> > +      if (TYPE_UNSIGNED (type))
> > +	return uint64_type_node;
> > +      else
> > +	return long_long_integer_type_node;
> > +    }
> > +  gcc_unreachable ();
> > +}
> 
> No need for a gcc_unreachable().  You have a final `else'; you'll
> clearly never reach the unreachable.

Removed.

> 
> >  static void
> > -create_omp_child_function (omp_context *ctx, bool task_copy)
> > +create_omp_child_function (omp_context *ctx, bool task_copy,
> > +			   bool is_cilk_for, tree cilk_var_type)
> >  {
> >    tree decl, type, name, t;
> > -
> > -  name = create_omp_child_function_name (task_copy);
> > +
> > +  name = create_omp_child_function_name (task_copy, is_cilk_for);
> 
> It looks like task_copy and is_cilk_for never coexist throughout.
> Perhaps this is a candidate for an enum?

I can do that, but isn't an enum stored at the same size as an int, so that two bools are better space-wise?
> 
> > +  /* _Cilk_for's child function requires two extra parameters called
> > +     __low and __high that are set the by Cilk runtime when it calls this
> > +     function.  */
> > +  if (is_cilk_for)
> > +    {
> > +      t = build_decl (DECL_SOURCE_LOCATION (decl),
> > +		      PARM_DECL, get_identifier ("__high"), cilk_var_type);
> 
> Perhaps you should add a similar comment here too:
> 
Fixed. Pointed to the note.

> > +  else if (is_cilk_for)
> > +    type = build_function_type_list (void_type_node, ptr_type_node,
> > +				     cilk_var_type, cilk_var_type, NULL_TREE);
> 
> 
> > +	      /* In _Cilk_for, the increment, start and final values
> > +		 are stored in the clause inserted by gimplify_omp_for.
> > +		 This value is used by the child function to find the
> > +		 appropriate induction value function based on the
> > +		 high and low parameters of the child function.
> > +		 Now, we need to store the decl value expressions here so
> > +		 that we can easily access them.  */
> > +	      if (flag_enable_cilkplus
> > +		  && (is_cilk_loop_var (var, "__cilk_init")
> > +		      || is_cilk_loop_var (var, "__cilk_cond")
> > +		      || is_cilk_loop_var (var, "__cilk_incr")))
> > +		SET_DECL_VALUE_EXPR (var, x);
> 
> This looks weird.  I'll let Jakub and Jason comment on whether this is
> the correct place to put this information.  Can't you check somehow if
> the loop is a Cilk_for loop without having to look at the variable names
> themselves?
> 

That's not what I am doing here. I am setting the DECL_VALUE_EXPR for the init, cond and incr temporary variables that I have created.

> > +/* Returns true if T is a tree whose code is COMPONENT_REF and its field
> > +   matches D_F_NAME and the data argument matches D_ARG_NAME.  */
> > +
> > +static bool
> > +cilk_find_field_value (tree t, tree d_arg_name, tree d_f_name)
> > +{
> > +  if (TREE_CODE (t) == COMPONENT_REF)
> > +    {
> > +      tree arg = TREE_OPERAND (t, 0);
> > +      tree field = TREE_OPERAND (t, 1);
> > +      if (TREE_CODE (arg) == ADDR_EXPR || TREE_CODE (arg) == MEM_REF)
> > +	arg = TREE_OPERAND (arg, 0);
> > +      if (DECL_NAME (arg) && DECL_NAME (field)
> > +	  && !strcmp (IDENTIFIER_POINTER (d_arg_name),
> > +		      IDENTIFIER_POINTER (DECL_NAME (arg)))
> > +	  && !strcmp (IDENTIFIER_POINTER (d_f_name),
> > +		      IDENTIFIER_POINTER (DECL_NAME (field))))
> 
> Also weird.  Do we really need to look at the identifier pointer itself?
>   Again, I'll let Jason and/or Jakub comment.
> 
> I'll let Jakub comment on the functional parts of the omp-low.c parts,
> but I would prefer that a lot of big functions in omp-low.c that only
> pertain to Cilk Plus, be moved to a cilk specific file.  For example,
> expand_cilk_for_body() and helpers.
> 
> Aldy


* Re: [PATCH] _Cilk_for for C and C++
  2014-01-19  4:50                                         ` Iyer, Balaji V
@ 2014-01-23 10:12                                           ` Jakub Jelinek
  2014-01-23 16:38                                             ` Iyer, Balaji V
  2014-01-24 19:28                                             ` Iyer, Balaji V
  0 siblings, 2 replies; 42+ messages in thread
From: Jakub Jelinek @ 2014-01-23 10:12 UTC (permalink / raw)
  To: Iyer, Balaji V
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

On Sun, Jan 19, 2014 at 04:50:39AM +0000, Iyer, Balaji V wrote:
> I have answered your questions below. In addition to your changes, I have
> also fixed the issues Aldy pointed out and have answered his questions in
> that thread.  With this email I have attached two patches and 2
> change-logs (for C and C++).  I have also rebased these patches to the
> trunk revision (r206756)

Haven't looked at the patch yet, just applied it and looked at what it generates:

Say in cilk_fors.c -O2 -fcilkplus -std=c99 -fdump-tree-{original,gimple,omplower,ompexp}
I'm seeing in *.original dump:
    <<< Unknown tree: cilk_for
which suggests that tree-pretty-print.c doesn't handle CILK_FOR.

Much more important is what is seen in the *.gimple dump though:
           schedule(runtime,0) private(ii)
          _Cilk_for (ii = 0; ii <= 9; ii = ii + 1)
            {
              #pragma omp parallel shared(__cilk_incr.0) shared(__cilk_cond.2) shared(__cilk_init.1) shared(ii) shared(Array)
                {
                  Array[ii] = 1133;
                }
            }
Why do you put the parallel inside of _Cilk_for rather than the other way
around?  That looks just wrong.  That would represent runtime scheduling of
work across the parallel regions at the _Cilk_for, and then in each
iteration running the body in several threads concurrently.
You want the parallel around the _Cilk_for, and
gimple_omp_parallel_set_combined_p (parallel_stmt, true) so that you can
then handle it specially during omp lowering/expansion.

Also, the printing of _Cilk_for is weird, the clauses (with space before)
look really weird above the _Cilk_for when there is no #pragma or something
similar.  Perhaps print the clauses after _Cilk_for?
  _Cilk_for (ii = 0; ii <= 9; ii = ii + 1) schedule(runtime,0) private(ii)
?

	Jakub


* RE: [PATCH] _Cilk_for for C and C++
  2014-01-23 10:12                                           ` Jakub Jelinek
@ 2014-01-23 16:38                                             ` Iyer, Balaji V
  2014-01-24 19:41                                               ` Jakub Jelinek
  2014-01-24 19:28                                             ` Iyer, Balaji V
  1 sibling, 1 reply; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-23 16:38 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

Hi Jakub,

> -----Original Message-----
> From: Jakub Jelinek [mailto:jakub@redhat.com]
> Sent: Thursday, January 23, 2014 5:13 AM
> To: Iyer, Balaji V
> Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'rth@redhat.com'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On Sun, Jan 19, 2014 at 04:50:39AM +0000, Iyer, Balaji V wrote:
> > I have answered your questions below. In addition to your changes, I
> > have also fixed the issues Aldy pointed out and have answered his
> > questions in that thread.  With this email I have attached two patches
> > and 2 change-logs (for C and C++).  I have also rebased these patches
> > to the trunk revision (r206756)
> 
> Haven't looked at the patch yet, just applied it and looked at what it generates:
> 
> Say in cilk_fors.c -O2 -fcilkplus -std=c99 -fdump-tree-
> {original,gimple,omplower,ompexp}
> I'm seeing in *.original dump:
>     <<< Unknown tree: cilk_for
> which suggests that tree-pretty-print.c doesn't handle CILK_FOR.
> 

OK. I will work on this and send you a patch soon.

> Much more important is what is seen in the *.gimple dump though:
>            schedule(runtime,0) private(ii)
>           _Cilk_for (ii = 0; ii <= 9; ii = ii + 1)
>             {
>               #pragma omp parallel shared(__cilk_incr.0) shared(__cilk_cond.2)
> shared(__cilk_init.1) shared(ii) shared(Array)
>                 {
>                   Array[ii] = 1133;
>                 }
>             }
> Why do you put the parallel inside of _Cilk_for rather than the other way
> around?  That looks just wrong.  That would represent runtime scheduling of
> work across the parallel regions at the _Cilk_for, and then in each iteration
> running the body in several threads concurrently.
> You want the parallel around the _Cilk_for, and
> gimple_omp_parallel_set_combined_p (parallel_stmt, true) so that you can
> then handle it specially during omp lowering/expansion.

	This is how I started to think of it at first, but when I thought about it further I realized that in _Cilk_for, unlike the #pragma simd's for, the for statement itself - not the body - (e.g. "_Cilk_for (int ii = 0; ii < 10; ii++)") doesn't really do anything, nor does it belong in the child function. It is mostly used to calculate the loop count and to capture the step size and starting point.

	The child function has its own loop that has a step size of 1 regardless of your step size. You use the step size to find the correct spot. Let me give you an example:

_Cilk_for (int ii = 0; ii < 10; ii = ii  + 2)
{
	Array [ii] = 5;
}

This is translated to the following (assume grain is something that the user input):

data_ptr.start = 0;
data_ptr.end = 10;
data_ptr.step_size = 2;
__cilkrts_cilk_for_32 (child_function, &data_ptr, (10-0)/2, grain);

Child_function (void *data_ptr, int high, int low)
{
	for (xx = low; xx < high; xx++) 
	 {
		Tmp_var = (xx * data_ptr->step_size) + data_ptr->start;
		// Note: if the _Cilk_for was (ii = 9; ii >= 0; ii -= 2), we would have something like this:
		// Tmp_var = data_ptr->end - (xx * data_ptr->step_size)
		// The for-loop above won't change.  
		Array[Tmp_var] = 5;
	}
}

High and low are passed in by the runtime and thus their range can be any sub-range of the loop count (in this case between 0 and 5).
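
As a rough illustration of the scheme above, here is a small serial C sketch. It is not what the compiler or the Cilk runtime actually does: fake_cilk_for_32, struct loop_data and run_example are names made up for this example, the (low, high) parameter order of the child function is an assumption, and the stand-in simply walks the chunks one after another instead of running them in parallel. It only shows how a unit-stride [low, high) range is mapped back to the user's induction variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the data block built in the parent function. */
struct loop_data
{
  int start;      /* Initial value of the user's induction variable.  */
  int end;        /* Exclusive upper bound from the loop condition.  */
  int step_size;  /* Increment written by the user.  */
  int *array;     /* Variable shared with the loop body.  */
};

/* Child function: always iterates with a step of 1 over [low, high) and
   maps each unit-stride index back to the user's induction variable.  */
static void
child_function (void *data, int low, int high)
{
  struct loop_data *d = (struct loop_data *) data;
  for (int xx = low; xx < high; xx++)
    {
      int ii = xx * d->step_size + d->start;
      d->array[ii] = 5;
    }
}

/* Serial stand-in for the runtime entry point (assumption: the real one
   distributes the chunks across workers).  It splits [0, count) into
   chunks of at most GRAIN iterations and calls BODY on each chunk.  */
static void
fake_cilk_for_32 (void (*body) (void *, int, int), void *data,
                  int count, int grain)
{
  if (grain < 1)
    grain = 1;
  for (int low = 0; low < count; low += grain)
    {
      int high = low + grain < count ? low + grain : count;
      body (data, low, high);
    }
}

/* Parent side of: _Cilk_for (int ii = 0; ii < 10; ii = ii + 2)
                     Array[ii] = 5;
   ARRAY must have at least 10 elements.  */
static void
run_example (int *array, int grain)
{
  struct loop_data d = { 0, 10, 2, array };
  /* Trip count, rounded up; here it is 5, the same as (10 - 0) / 2.  */
  int count = (d.end - d.start + d.step_size - 1) / d.step_size;
  fake_cilk_for_32 (child_function, &d, count, grain);
}
```

Calling run_example (array, 2) on a zeroed ten-element array calls the child function on the chunks [0, 2), [2, 4) and [4, 5), and only the even-indexed elements are written.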

Now, if we model this like #pragma omp parallel for, then the whole _Cilk_for statement will also be pulled into the child function. This can be circumvented (although I feel it is a bit convoluted) by using the region->data_arg pointers to pass values back and forth and doing other adjustments in expand_omp_for, and it will work for C, but it gets problematic for STL. This is because during gimplification, vector.begin() and vector.end() are replaced with the start and end integers and the translation is put in the child function, and that messes things up in omp-low.c and in the calculation of the count in the parent function. Another thing also gets messed up for C++, which I can't recall off the top of my head.

At a high level, you can't think of _Cilk_for in terms of OpenMP's for. The two are orthogonal technologies that produce parallel code with different starting points and assumptions. The reason I modeled it this way in the compiler is so that I can use OMP's compiler routines. Some things, such as creating a child function and inserting *a* call to the library function, are the same, and so the routines that do those can be shared, but the internals are very different, which is why it is modeled this way in the compiler.

> 
> Also, the printing of _Cilk_for is weird, the clauses (with space before) look
> really weird above the _Cilk_for when there is no #pragma or something
> similar.  Perhaps print the clauses after _Cilk_for?
>   _Cilk_for (ii = 0; ii <= 9; ii = ii + 1) schedule(runtime,0) private(ii) ?
> 

OK. I will work on this and send you a patch.

> 	Jakub


* RE: [PATCH] _Cilk_for for C and C++
  2014-01-23 10:12                                           ` Jakub Jelinek
  2014-01-23 16:38                                             ` Iyer, Balaji V
@ 2014-01-24 19:28                                             ` Iyer, Balaji V
  1 sibling, 0 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-24 19:28 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

[-- Attachment #1: Type: text/plain, Size: 2607 bytes --]

Hi Jakub,
	I am attaching the latest patches with the pretty-print fixes that you mentioned for _Cilk_for. Nothing else has changed except that I have rebased onto the latest trunk revision (r207047). I have tested on x86_64 in both 32-bit and 64-bit modes; all tests pass and no other tests are disrupted.

	Please let me know if it is OK to trunk.

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Jakub Jelinek [mailto:jakub@redhat.com]
> Sent: Thursday, January 23, 2014 5:13 AM
> To: Iyer, Balaji V
> Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'rth@redhat.com'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On Sun, Jan 19, 2014 at 04:50:39AM +0000, Iyer, Balaji V wrote:
> > I have answered your questions below. In addition to your changes, I
> > have also fixed the issues Aldy pointed out and have answered his
> > questions in that thread.  With this email I have attached two patches
> > and 2 change-logs (for C and C++).  I have also rebased these patches
> > to the trunk revision (r206756)
> 
> Haven't looked at the patch yet, just applied and looked what it generates:
> 
> Say in cilk_fors.c -O2 -fcilkplus -std=c99 -fdump-tree-
> {original,gimple,omplower,ompexp}
> I'm seeing in *.original dump:
>     <<< Unknown tree: cilk_for
> which suggests that tree-pretty-print.c doesn't handle CILK_FOR.
> 
> Much more important is what is seen in the *.gimple dump though:
>            schedule(runtime,0) private(ii)
>           _Cilk_for (ii = 0; ii <= 9; ii = ii + 1)
>             {
>               #pragma omp parallel shared(__cilk_incr.0) shared(__cilk_cond.2)
> shared(__cilk_init.1) shared(ii) shared(Array)
>                 {
>                   Array[ii] = 1133;
>                 }
>             }
> Why do you put the parallel inside of _Cilk_for rather than the other way
> around?  That looks just wrong.  That would represent runtime scheduling of
> work across the parallel regions at the _Cilk_for, and then in each iteration
> running the body in several threads concurrently.
> You want the parallel around the _Cilk_for, and
> gimple_omp_parallel_set_combined_p (parallel_stmt, true) so that you can
> then handle it specially during omp lowering/expansion.
> 
> Also, the printing of _Cilk_for is weird, the clauses (with space before) look
> really weird above the _Cilk_for when there is no #pragma or something
> similar.  Perhaps print the clauses after _Cilk_for?
>   _Cilk_for (ii = 0; ii <= 9; ii = ii + 1) schedule(runtime,0) private(ii) ?
> 
> 	Jakub

[-- Attachment #2: diff_c++.txt --]
[-- Type: text/plain, Size: 17359 bytes --]

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index e12a528..2c2410e 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -237,8 +237,8 @@ static void cp_parser_initial_pragma
 static tree cp_literal_operator_id
   (const char *);
 
-static void cp_parser_cilk_simd
-  (cp_parser *, cp_token *);
+static tree cp_parser_cilk_simd
+  (cp_parser *, cp_token *, tree);
 static bool cp_parser_omp_declare_reduction_exprs
   (tree, cp_parser *);
 static tree cp_parser_cilk_simd_vectorlength 
@@ -9380,6 +9380,18 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
 
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (cp_lexer_peek_token (parser->lexer)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      cp_lexer_consume_token (parser->lexer);
+	      statement = error_mark_node;
+	    }
+	  else
+	    statement = cp_parser_cilk_simd (parser, NULL, integer_zero_node);
+	  break;
+
 	case RID_BREAK:
 	case RID_CONTINUE:
 	case RID_RETURN:
@@ -28865,7 +28877,7 @@ cp_parser_omp_for_cond (cp_parser *parser, tree decl, enum tree_code code)
     case LE_EXPR:
       break;
     case NE_EXPR:
-      if (code == CILK_SIMD)
+      if (code == CILK_SIMD || code == CILK_FOR)
 	break;
       /* Fall through: OpenMP disallows NE_EXPR.  */
     default:
@@ -29190,11 +29202,18 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       bool add_private_clause = false;
       location_t loc;
 
-      if (!cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
+      if (code == CILK_SIMD
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
 	{
 	  cp_parser_error (parser, "for statement expected");
 	  return NULL;
 	}
+      if (code == CILK_FOR
+	  && !cp_lexer_next_token_is_keyword (parser->lexer, RID_CILK_FOR))
+	{
+	  cp_parser_error (parser, "_Cilk_for statement expected");
+	  return NULL;
+	}
       loc = cp_lexer_consume_token (parser->lexer)->location;
 
       if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
@@ -29203,13 +29222,26 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
       init = decl = real_decl = NULL;
       this_pre_body = push_stmt_list ();
 
+      if (code == CILK_FOR
+	  && cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC))
+	{
+	  error_at (cp_lexer_peek_token (parser->lexer)->location,
+		    "induction variable cannot be static");
+	  cp_lexer_consume_token (parser->lexer);
+	}
       add_private_clause
 	|= cp_parser_omp_for_loop_init (parser,
-					/*parsing_openmp=*/code != CILK_SIMD,
+					/*parsing_openmp=*/
+					(code != CILK_SIMD && code != CILK_FOR),
 					this_pre_body, for_block,
 					init, decl, real_decl);
 
-      cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
+      if (!cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON)
+	  && code == CILK_FOR)
+	{
+	  cp_parser_skip_to_end_of_statement (parser);
+	  cp_parser_consume_semicolon_at_end_of_statement (parser);
+	}
       if (this_pre_body)
 	{
 	  this_pre_body = pop_stmt_list (this_pre_body);
@@ -29367,17 +29399,30 @@ cp_parser_omp_for_loop (cp_parser *parser, enum tree_code code, tree clauses,
 
   /* Note that we saved the original contents of this flag when we entered
      the structured block, and so we don't need to re-save it here.  */
-  if (code == CILK_SIMD)
+  if (code == CILK_SIMD || code == CILK_FOR)
     parser->in_statement = IN_CILK_SIMD_FOR;
   else
     parser->in_statement = IN_OMP_FOR;
 
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = begin_omp_parallel ();
+    }
+
   /* Note that the grammar doesn't call for a structured block here,
      though the loop as a whole is a structured block.  */
   body = push_stmt_list ();
   cp_parser_statement (parser, NULL_TREE, false, NULL);
   body = pop_stmt_list (body);
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = finish_omp_parallel (NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
   if (declv == NULL_TREE)
     ret = NULL_TREE;
   else
@@ -31312,6 +31357,38 @@ cp_parser_initial_pragma (cp_token *first_token)
   cp_lexer_get_preprocessor_token (NULL, first_token);
 }
 
+/* Parses the grainsize pragma for the _Cilk_for statement.
+   Syntax:
+   #pragma cilk grainsize = <VALUE>.  */
+
+static void
+cp_parser_cilk_grainsize (cp_parser *parser, cp_token *pragma_tok)
+{
+  if (cp_parser_require (parser, CPP_EQ, RT_EQ))
+    {
+      tree exp = cp_parser_binary_expression (parser, false, false,
+                                              PREC_NOT_OPERATOR, NULL);
+      cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+      if (!exp || exp == error_mark_node)
+        {
+          error_at (pragma_tok->location, "invalid grainsize for _Cilk_for");
+          return;
+        }
+      cp_token *n_tok = cp_lexer_peek_token (parser->lexer);
+
+      /* Make sure the next token is _Cilk_for, it is invalid otherwise.  */
+      if (n_tok && n_tok->type == CPP_KEYWORD
+	  && n_tok->keyword == RID_CILK_FOR)
+	cp_parser_cilk_simd (parser, NULL, exp);
+      else
+	warning_at (cp_lexer_peek_token (parser->lexer)->location, 0,
+		    "%<#pragma cilk grainsize%> is not followed by "
+		    "%<_Cilk_for%>");
+      return;
+    }
+  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+}
+
 /* Normal parsing of a pragma token.  Here we can (and must) use the
    regular lexer.  */
 
@@ -31491,9 +31568,30 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
 		    "%<#pragma simd%> must be inside a function");
 	  break;
 	}
-      cp_parser_cilk_simd (parser, pragma_tok);
+      cp_parser_cilk_simd (parser, pragma_tok, NULL_TREE);
       return true;
 
+    case PRAGMA_CILK_GRAINSIZE:
+      if (context == pragma_external)
+        {
+          error_at (pragma_tok->location,
+                    "%<#pragma cilk grainsize%> must be inside a function");
+          break;
+        }
+
+      /* Ignore the pragma if Cilk Plus is not enabled.  */
+      if (flag_enable_cilkplus)
+        {
+          cp_parser_cilk_grainsize (parser, pragma_tok);
+          return true;
+        }
+      else
+        {
+          error_at (pragma_tok->location, "-fcilkplus must be enabled to use "
+                    "%<#pragma cilk grainsize%>");
+          break;
+	}
+      
     default:
       gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
       c_invoke_pragma_handler (id);
@@ -31811,31 +31909,64 @@ cp_parser_cilk_simd_all_clauses (cp_parser *parser, cp_token *pragma_token)
     return c_finish_cilk_clauses (clauses);
 }
 
-/* Main entry-point for parsing Cilk Plus <#pragma simd> for loops.  */
+/* Main entry-point for parsing Cilk Plus <#pragma simd> for and _Cilk_for
+   loops.  This function returns NULL_TREE whenever it is parsing the
+   #pragma simd's for because the caller does not check the return value.
+   _Cilk_for's caller checks this value and thus return error_mark_node
+   when errors happen and a valid value when things go as expected.  */
 
-static void
-cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token)
+static tree
+cp_parser_cilk_simd (cp_parser *parser, cp_token *pragma_token, tree grain)
 {
-  tree clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
+  bool is_cilk_for = !pragma_token ? true: false;
+  tree clauses = NULL_TREE;
+
+  if (!is_cilk_for)
+    clauses = cp_parser_cilk_simd_all_clauses (parser, pragma_token);
 
   if (clauses == error_mark_node)
-    return;
-  
-  if (cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
+    return NULL_TREE;
+
+  if (!is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_FOR))
     {
       error_at (cp_lexer_peek_token (parser->lexer)->location,
 		"for statement expected");
-      return;
+      return NULL_TREE;
+    }
+  if (is_cilk_for
+      && cp_lexer_next_token_is_not_keyword (parser->lexer, RID_CILK_FOR))
+    {
+      error_at (cp_lexer_peek_token (parser->lexer)->location,
+		"_Cilk_for statement expected");
+      return error_mark_node;
     }
 
   tree sb = begin_omp_structured_block ();
   int save = cp_parser_begin_omp_structured_block (parser);
-  tree ret = cp_parser_omp_for_loop (parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  tree ret = cp_parser_omp_for_loop (parser, code, clauses, NULL);
   if (ret)
     cpp_validate_cilk_plus_loop (OMP_FOR_BODY (ret));
+
+  /* For _Cilk_for statements, the grain value is stored in the same
+     location as clauses for OMP for inside a schedule clause.  */
+  if (is_cilk_for && ret)
+    { 
+      tree l = build_omp_clause (EXPR_LOCATION (grain),
+				 OMP_CLAUSE_SCHEDULE);
+      OMP_CLAUSE_SCHEDULE_KIND (l) = OMP_CLAUSE_SCHEDULE_RUNTIME;
+      OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (l) = grain;
+      OMP_CLAUSE_CHAIN (l) = NULL_TREE;
+      OMP_FOR_CLAUSES (ret) = l;
+    }
+
   cp_parser_end_omp_structured_block (parser, save);
-  add_stmt (finish_omp_structured_block (sb));
-  return;
+  tree stmt = finish_omp_structured_block (sb);
+  add_stmt (stmt);
+  if (is_cilk_for) 
+    return stmt;
+  return NULL_TREE;
 }
 
 /* Create an identifier for a generic parameter type (a synthesized
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 930ca29..2bb85a0 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13586,6 +13586,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
     case OMP_FOR:
     case OMP_SIMD:
     case CILK_SIMD:
+    case CILK_FOR:
     case OMP_DISTRIBUTE:
       {
 	tree clauses, body, pre_body;
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 3a8daca..bb64a06 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -6036,7 +6036,8 @@ finish_omp_task (tree clauses, tree body)
 static bool
 handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
 			       tree condv, tree incrv, tree *body,
-			       tree *pre_body, tree clauses)
+			       tree *pre_body, tree clauses,
+			       bool is_cilk_for)
 {
   tree diff, iter_init, iter_incr = NULL, last;
   tree incr_var = NULL, orig_pre_body, orig_body, c;
@@ -6056,6 +6057,7 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
     case GE_EXPR:
     case LT_EXPR:
     case LE_EXPR:
+    case NE_EXPR:
       if (TREE_OPERAND (cond, 1) == iter)
 	cond = build2 (swap_tree_comparison (TREE_CODE (cond)),
 		       TREE_TYPE (cond), iter, TREE_OPERAND (cond, 0));
@@ -6199,6 +6201,11 @@ handle_omp_for_class_iterator (int i, location_t locus, tree declv, tree initv,
       break;
 
   decl = create_temporary_var (TREE_TYPE (diff));
+  /* In _Cilk_for we must know the induction variable name since it is
+     read by expand_cilk_for_body in omp-low.c to set the induction
+     variable in the child function correctly.  */
+  if (is_cilk_for)
+    DECL_NAME (decl) = make_anon_name ();
   pushdecl (decl);
   add_decl_expr (decl);
   last = create_temporary_var (TREE_TYPE (diff));
@@ -6414,8 +6421,24 @@ finish_omp_for (location_t locus, enum tree_code code, tree declv, tree initv,
 				"iteration variable %qE", decl);
 	      return NULL;
 	    }
-	  if (handle_omp_for_class_iterator (i, locus, declv, initv, condv,
-					     incrv, &body, &pre_body, clauses))
+
+	  /* In _Cilk_for, all the iterator mapping code should be
+	     inserted in the OMP_PARALLEL_BODY.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree the_body = OMP_PARALLEL_BODY (body);
+	      if (TREE_CODE (the_body) == BIND_EXPR)
+		the_body = BIND_EXPR_BODY (the_body);
+	      if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						 condv, incrv, &the_body,
+						 &pre_body, clauses, true))
+		return NULL;
+	      else
+		BIND_EXPR_BODY (OMP_PARALLEL_BODY (body)) = the_body;
+	    }
+	  else if (handle_omp_for_class_iterator (i, locus, declv, initv,
+						  condv, incrv, &body,
+						  &pre_body, clauses, false))
 	    return NULL;
 	  continue;
 	}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
new file mode 100644
index 0000000..8221371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/cilk-for-tplt.cc
@@ -0,0 +1,25 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#define SIZE 100
+#define CHECK_VALUE 5
+
+template <class T>
+int func (T start, T end)
+{
+  int Array[SIZE];
+  _Cilk_for (T ii = 0; ii < end; ii++)
+    Array[ii] = CHECK_VALUE;
+  
+  for (T ii = 0; ii < end; ii++)
+    if (Array[ii] != CHECK_VALUE)
+      __builtin_abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  return func <int> (0, 100) + func <long> (0, 100);
+}
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
new file mode 100644
index 0000000..2ac8c72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_iter.cc
@@ -0,0 +1,52 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array;
+vector <int> array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back (ii);
+}
+#endif
+_Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end();
+          iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+for (vector<int>::iterator iter = array_serial.begin(); 
+     iter != array_serial.end(); iter++)
+{
+   if (*iter  == 6) 
+     *iter = 13;
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
new file mode 100644
index 0000000..1cf3301
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_rev_iter.cc
@@ -0,0 +1,72 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <vector>
+#include <cstdio>
+#include <iostream>
+#include <algorithm>
+
+using namespace std;
+
+
+int main(void)
+{
+vector <int> array,array_serial;
+
+#if 1
+for (int ii = -1; ii < 10; ii++)
+{   
+  array.push_back(ii);
+  array_serial.push_back(ii);
+}
+#endif
+_Cilk_for (vector<int>::reverse_iterator iter4 = array.rbegin(); 
+	   iter4 != array.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+
+_Cilk_for (vector<int>::reverse_iterator iter4 = array_serial.rbegin(); 
+	   iter4 != array_serial.rend(); iter4++)
+{
+  if (*iter4 == 0x8) {
+    *iter4 = 9;
+  }
+}
+_Cilk_for (vector<int>::reverse_iterator iter2 = array.rbegin(); 
+	   iter2 != array.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+for (vector<int>::reverse_iterator iter2 = array_serial.rbegin(); 
+     iter2 != array_serial.rend();
+          iter2 += 1) 
+{
+   if ((*iter2 == 0x4) || (*iter2 == 0x7)) {
+    *iter2 = 0x3;
+   }
+}
+sort (array.begin(), array.end());
+sort (array_serial.begin(), array_serial.end());
+
+vector <int>::iterator iter = array.begin ();
+vector <int>::iterator iter_serial = array_serial.begin ();
+while (iter != array.end () && iter_serial != array_serial.end ())
+{
+  if (*iter != *iter_serial)
+    __builtin_abort ();
+  iter++;
+  iter_serial++;
+}
+
+return 0;
+}   
+
+
diff --git a/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
new file mode 100644
index 0000000..3e350a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/CK/stl_test.cc
@@ -0,0 +1,50 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+#include <iostream>
+#include <cstdio>
+#include <cstdlib>
+#include <vector>
+#include <algorithm>
+#include <list>
+
+using namespace std;
+
+
+int main(int argc, char **argv)
+{
+  vector <int> number_list, number_list_serial;
+  int new_number = 0;
+  int no_elements = 0;
+  
+  if (argc != 2)
+  {
+    no_elements = 10000;
+  }
+
+
+  number_list.clear();
+  number_list_serial.clear();
+  for (int ii = 0; ii < no_elements; ii++)
+  {
+    number_list.push_back(new_number);
+    number_list_serial.push_back(new_number);
+  }
+
+  _Cilk_for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list[jj] = jj + no_elements;
+  }
+  for (int jj = 0; jj < no_elements; jj++)
+  {
+    number_list_serial[jj] = jj + no_elements;
+  }
+
+  for (int jj = 0; jj < no_elements; jj++)
+    if (number_list_serial[jj] != number_list[jj])
+      abort ();
+
+  return 0;
+}

[-- Attachment #3: diff_c.txt --]
[-- Type: text/plain, Size: 80802 bytes --]

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 86cec72..1caed53
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -414,6 +414,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Complex",		RID_COMPLEX,	0 },
   { "_Cilk_spawn",      RID_CILK_SPAWN, 0 },
   { "_Cilk_sync",       RID_CILK_SYNC,  0 },
+  { "_Cilk_for",        RID_CILK_FOR,   0 },
   { "_Imaginary",	RID_IMAGINARY, D_CONLY },
   { "_Decimal32",       RID_DFLOAT32,  D_CONLY | D_EXT },
   { "_Decimal64",       RID_DFLOAT64,  D_CONLY | D_EXT },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index d7077fd..96e4959 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -149,7 +149,7 @@ enum rid
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
   /* Cilk Plus keywords.  */
-  RID_CILK_SPAWN, RID_CILK_SYNC,
+  RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR,
   
   /* Objective-C ("AT" reserved words - they are only keywords when
      they follow '@')  */
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 4ce51e4..bb4f6a1 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -397,7 +397,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
   bool fail = false;
   int i;
 
-  if (code == CILK_SIMD
+  if ((code == CILK_SIMD || code == CILK_FOR) 
       && !c_check_cilk_loop (locus, TREE_VEC_ELT (declv, 0)))
     fail = true;
 
@@ -527,7 +527,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 					    0))
 		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
 					 ? LT_EXPR : GE_EXPR);
-		  else if (code != CILK_SIMD)
+		  else if (code != CILK_SIMD && code != CILK_FOR)
 		    cond_ok = false;
 		}
 	    }
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index af28085..6f22148 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1394,6 +1394,11 @@ init_pragma (void)
 
   cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false,
 				false);
+
+  if (flag_enable_cilkplus && !flag_preprocess_only)
+    cpp_register_deferred_pragma (parse_in, "cilk", "grainsize",
+				  PRAGMA_CILK_GRAINSIZE, true, false);
+
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, "pack", handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 6f1bf74..b9f09ba 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -55,6 +55,9 @@ typedef enum pragma_kind {
   /* Top level clause to handle all Cilk Plus pragma simd clauses.  */
   PRAGMA_CILK_SIMD,
 
+  /* This pragma handles setting of grainsize for a _Cilk_for.  */
+  PRAGMA_CILK_GRAINSIZE,
+
   PRAGMA_GCC_PCH_PREPROCESS,
   PRAGMA_IVDEP,
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index bbf5287..293b628
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1248,10 +1248,11 @@ static bool c_parser_objc_diagnose_bad_element_prefix
   (c_parser *, struct c_declspecs *);
 
 /* Cilk Plus supporting routines.  */
-static void c_parser_cilk_simd (c_parser *);
+static void c_parser_cilk_simd (c_parser *, tree);
 static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context);
 static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
 static tree c_parser_cilk_clause_vectorlength (c_parser *, tree, bool);
+static void c_parser_cilk_grainsize (c_parser *);
 
 /* Parse a translation unit (C90 6.7, C99 6.9).
 
@@ -4878,6 +4879,16 @@ c_parser_statement_after_labels (c_parser *parser)
 	case RID_FOR:
 	  c_parser_for_statement (parser, false);
 	  break;
+	case RID_CILK_FOR:
+	  if (!flag_enable_cilkplus)
+	    {
+	      error_at (c_parser_peek_token (parser)->location,
+			"-fcilkplus must be enabled to use %<_Cilk_for%>");
+	      c_parser_skip_to_end_of_block_or_statement (parser);
+	    }
+	  else
+	    c_parser_cilk_simd (parser, integer_zero_node);
+	  break;
 	case RID_CILK_SYNC:
 	  c_parser_consume_token (parser);
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
@@ -9488,7 +9499,25 @@ c_parser_pragma (c_parser *parser, enum pragma_context context)
       if (!c_parser_cilk_verify_simd (parser, context))
 	return false;
       c_parser_consume_pragma (parser);
-      c_parser_cilk_simd (parser);
+      c_parser_cilk_simd (parser, NULL_TREE);
+      return false;
+
+    case PRAGMA_CILK_GRAINSIZE:
+      if (!flag_enable_cilkplus)
+	{
+	  warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not"
+		   " enabled");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      if (context == pragma_external)
+	{
+	  error_at (c_parser_peek_token (parser)->location,
+		    "%<#pragma grainsize%> must be inside a function");
+	  c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL);
+	  return false;
+	}
+      c_parser_cilk_grainsize (parser);
       return false;
 
     default:
@@ -11583,7 +11612,7 @@ c_parser_omp_flush (c_parser *parser)
 
 static tree
 c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
-		       tree clauses, tree *cclauses)
+		       tree clauses, tree grain, tree *cclauses)
 {
   tree decl, cond, incr, save_break, save_cont, body, init, stmt, cl;
   tree declv, condv, incrv, initv, ret = NULL;
@@ -11591,6 +11620,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   int i, collapse = 1, nbraces = 0;
   location_t for_loc;
   vec<tree, va_gc> *for_block = make_tree_vector ();
+  tree top_body = NULL_TREE, top_level_body = NULL_TREE;
 
   for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl))
     if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE)
@@ -11603,11 +11633,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
   condv = make_tree_vec (collapse);
   incrv = make_tree_vec (collapse);
 
-  if (!c_parser_next_token_is_keyword (parser, RID_FOR))
+  if (code != CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_FOR))
     {
       c_parser_error (parser, "for statement expected");
       return NULL;
     }
+  if (code == CILK_FOR
+      && !c_parser_next_token_is_keyword (parser, RID_CILK_FOR))
+    {
+      c_parser_error (parser, "_Cilk_for statement expected");
+      return NULL;
+    }
   for_loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
 
@@ -11685,7 +11722,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	    case LE_EXPR:
 	      break;
 	    case NE_EXPR:
-	      if (code == CILK_SIMD)
+	      if (code == CILK_SIMD || code == CILK_FOR)
 		break;
 	      /* FALLTHRU.  */
 	    default:
@@ -11767,6 +11804,12 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
     c_break_label = size_one_node;
   save_cont = c_cont_label;
   c_cont_label = NULL_TREE;
+
+  if (code == CILK_FOR)
+    {
+      top_level_body = push_stmt_list ();
+      top_body = c_begin_omp_parallel ();
+    }
   body = push_stmt_list ();
 
   if (open_brace_parsed)
@@ -11814,6 +11857,13 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 	}
     }
 
+  if (code == CILK_FOR)
+    {
+      body = add_stmt (body);
+      body = c_finish_omp_parallel (loc, NULL_TREE, top_body);
+      body = pop_stmt_list (top_level_body);
+    }
+
   /* Only bother calling c_finish_omp_for if we haven't already generated
      an error from the initialization parsing.  */
   if (!fail)
@@ -11859,6 +11909,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
 		  }
 	    }
 	  OMP_FOR_CLAUSES (stmt) = clauses;
+	  /* If it is a _Cilk_for statement, then the OMP_FOR_CLAUSES location
+	     stores the user-defined grain value or an integer_zero_node 
+	     indicating that the runtime must compute a suitable grain.  */
+	  if (code == CILK_FOR)
+	    {
+	      tree l = build_omp_clause (EXPR_LOCATION (grain),
+					 OMP_CLAUSE_SCHEDULE);
+	      OMP_CLAUSE_SCHEDULE_KIND (l) = OMP_CLAUSE_SCHEDULE_RUNTIME;
+	      OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (l) = grain;
+	      OMP_CLAUSE_CHAIN (l) = NULL_TREE;
+	      OMP_FOR_CLAUSES (stmt) = l;
+	    }
 	}
       ret = stmt;
     }
@@ -11923,7 +11985,8 @@ c_parser_omp_simd (location_t loc, c_parser *parser,
     }
 
   block = c_begin_compound_stmt (true);
-  ret = c_parser_omp_for_loop (loc, parser, OMP_SIMD, clauses, cclauses);
+  ret = c_parser_omp_for_loop (loc, parser, OMP_SIMD, clauses, NULL_TREE,  
+			       cclauses);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 
@@ -12003,7 +12066,8 @@ c_parser_omp_for (location_t loc, c_parser *parser,
     }
 
   block = c_begin_compound_stmt (true);
-  ret = c_parser_omp_for_loop (loc, parser, OMP_FOR, clauses, cclauses);
+  ret = c_parser_omp_for_loop (loc, parser, OMP_FOR, clauses,  NULL_TREE, 
+			       cclauses);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 
@@ -12486,7 +12550,8 @@ c_parser_omp_distribute (location_t loc, c_parser *parser,
     }
 
   block = c_begin_compound_stmt (true);
-  ret = c_parser_omp_for_loop (loc, parser, OMP_DISTRIBUTE, clauses, NULL);
+  ret = c_parser_omp_for_loop (loc, parser, OMP_DISTRIBUTE, clauses, NULL_TREE,
+			       NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 
@@ -13763,16 +13828,63 @@ c_parser_cilk_all_clauses (c_parser *parser)
   return c_finish_cilk_clauses (clauses);
 }
 
+/* This function helps parse the grainsize pragma for a _Cilk_for statement. 
+   Here is the correct syntax of this pragma: 
+	    #pragma cilk grainsize = <EXP>  */
+
+static void
+c_parser_cilk_grainsize (c_parser *parser)
+{
+  extern tree convert_to_integer (tree, tree);
+
+  /* consume the 'grainsize' keyword.  */
+  c_parser_consume_pragma (parser);
+
+  if (c_parser_require (parser, CPP_EQ, "expected %<=%>"))
+    {
+      struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL);
+      if (g_expr.value && g_expr.value != error_mark_node)
+	{
+	  c_parser_skip_to_pragma_eol (parser);
+	  c_token *token = c_parser_peek_token (parser);
+	  if (token->type == CPP_KEYWORD && token->keyword == RID_CILK_FOR)
+	    {
+	      /* Remove the excess precision expression wrapper since
+		 we are going to convert it to long int.  */
+	      if (TREE_CODE (g_expr.value) == EXCESS_PRECISION_EXPR)
+		g_expr.value = TREE_OPERAND (g_expr.value, 0);
+	      tree grain = convert_to_integer (long_integer_type_node,
+					       g_expr.value);
+	      if (grain != error_mark_node) 
+		c_parser_cilk_simd (parser, grain);
+	    }
+	  else
+	    warning (0, "grainsize pragma is not followed by %<_Cilk_for%>");
+	}
+      else
+	c_parser_skip_to_pragma_eol (parser);
+    }
+  else
+    c_parser_skip_to_pragma_eol (parser);
+}
+
 /* Main entry point for parsing Cilk Plus <#pragma simd> for
-   loops.  */
+   loops.  GRAIN is analogous to a chunk size and is passed in by the
+   user through a grainsize pragma for _Cilk_for.  If the value is
+   zero, then the runtime computes an appropriate value.  */
 
 static void
-c_parser_cilk_simd (c_parser *parser)
+c_parser_cilk_simd (c_parser *parser, tree grain)
 {
-  tree clauses = c_parser_cilk_all_clauses (parser);
+  bool is_cilk_for = grain != NULL_TREE;
+  tree clauses = NULL_TREE;
+  
+  if (!is_cilk_for)
+    clauses = c_parser_cilk_all_clauses (parser);
   tree block = c_begin_compound_stmt (true);
   location_t loc = c_parser_peek_token (parser)->location;
-  c_parser_omp_for_loop (loc, parser, CILK_SIMD, clauses, NULL);
+  enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD;
+  c_parser_omp_for_loop (loc, parser, code, clauses, grain, NULL);
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 }
diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def
index 9f3240a..bf319d5 100644
--- a/gcc/cilk-builtins.def
+++ b/gcc/cilk-builtins.def
@@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame")
 DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32")
+DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64")
diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c
index afe88c9..753e5f0 100644
--- a/gcc/cilk-common.c
+++ b/gcc/cilk-common.c
@@ -106,6 +106,27 @@ install_builtin (const char *name, tree fntype, enum built_in_function code,
   return fndecl;
 }
 
+/* Returns a FUNCTION_DECL named NAME with loop counts of type TYPE.  */
+
+static tree
+declare_cilk_for_builtin (const char *name, tree type,
+			  enum built_in_function code)
+{
+  tree cb, ft, fn;
+
+  cb = build_function_type_list (void_type_node,
+                                 ptr_type_node, type, type,
+                                 NULL_TREE);
+  cb = build_pointer_type (cb);
+  ft = build_function_type_list (void_type_node,
+                                 cb, ptr_type_node, type,
+                                 integer_type_node, NULL_TREE);
+  fn = install_builtin (name, ft, code, false);
+  TREE_NOTHROW (fn) = 0;
+
+  return fn;
+}
+
 /* Creates and initializes all the built-in Cilk keywords functions and three
    structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker.
    Detailed information about __cilkrts_stack_frame and
@@ -269,6 +290,15 @@ cilk_init_builtins (void)
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 
 					 fptr_fun, BUILT_IN_CILK_SAVE_FP,
 					 false);
+  /* __cilkrts_cilk_for_32 (...);  */
+  cilk_for_32_fndecl = declare_cilk_for_builtin ("__cilkrts_cilk_for_32",
+						 unsigned_intSI_type_node,
+						 BUILT_IN_CILK_FOR_32);
+  /* __cilkrts_cilk_for_64 (...);  */
+  cilk_for_64_fndecl = declare_cilk_for_builtin ("__cilkrts_cilk_for_64",
+						 unsigned_intDI_type_node,
+						 BUILT_IN_CILK_FOR_64);
+
 }
 
 /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR.  */
diff --git a/gcc/cilk.h b/gcc/cilk.h
index d2ae931..0e98998 100644
--- a/gcc/cilk.h
+++ b/gcc/cilk.h
@@ -40,6 +40,9 @@ enum cilk_tree_index  {
   CILK_TI_F_POP,                      /* __cilkrts_pop_frame (...).  */
   CILK_TI_F_RETHROW,                  /* __cilkrts_rethrow (...).  */
   CILK_TI_F_SAVE_FP,                  /* __cilkrts_save_fp_ctrl_state (...).  */
+  CILK_TI_F_LOOP_32,                  /* __cilkrts_cilk_for_32 (...).  */
+  CILK_TI_F_LOOP_64,                  /* __cilkrts_cilk_for_64 (...).  */
+
   /* __cilkrts_stack_frame struct fields.  */
   CILK_TI_FRAME_FLAGS,                /* stack_frame->flags.  */
   CILK_TI_FRAME_PARENT,               /* stack_frame->parent.  */
@@ -77,6 +80,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX];
 #define cilk_rethrow_fndecl           cilk_trees[CILK_TI_F_RETHROW]
 #define cilk_pop_fndecl               cilk_trees[CILK_TI_F_POP]
 #define cilk_save_fp_fndecl           cilk_trees[CILK_TI_F_SAVE_FP]
+#define cilk_for_32_fndecl            cilk_trees[CILK_TI_F_LOOP_32]
+#define cilk_for_64_fndecl            cilk_trees[CILK_TI_F_LOOP_64]
 
 #define cilk_worker_type_fndecl       cilk_trees[CILK_TI_WORKER_TYPE]
 #define cilk_frame_type_decl          cilk_trees[CILK_TI_FRAME_TYPE]
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 2d1e1c7..59e9e2f 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -45,6 +45,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "value-prof.h"
 #include "trans-mem.h"
 
+static void dump_gimple_omp_parallel (pretty_printer *, gimple, int, int,
+				      bool);
 #define INDENT(SPACE)							\
   do { int i; for (i = 0; i < SPACE; i++) pp_space (buffer); } while (0)
 
@@ -1124,12 +1126,17 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_DISTRIBUTE:
 	  kind = " distribute";
 	  break;
+	case GF_OMP_FOR_KIND_CILKFOR:
+	  gcc_assert (flag_enable_cilkplus);
+	  kind = "";
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
       dump_gimple_fmt (buffer, spc, flags, "%G%s <%+BODY <%S>%nCLAUSES <", gs,
 		       kind, gimple_omp_body (gs));
-      dump_omp_clauses (buffer, gimple_omp_for_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_for_clauses (gs), spc, flags,
+			gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR);
       dump_gimple_fmt (buffer, spc, flags, " >,");
       for (i = 0; i < gimple_omp_for_collapse (gs); i++)
 	dump_gimple_fmt (buffer, spc, flags,
@@ -1158,16 +1165,24 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_DISTRIBUTE:
 	  pp_string (buffer, "#pragma omp distribute");
 	  break;
+	case GF_OMP_FOR_KIND_CILKFOR:
+	  gcc_assert (flag_enable_cilkplus);
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
-      dump_omp_clauses (buffer, gimple_omp_for_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_for_clauses (gs), spc, flags,
+			gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR);
       for (i = 0; i < gimple_omp_for_collapse (gs); i++)
 	{
 	  if (i)
 	    spc += 2;
 	  newline_and_indent (buffer, spc);
-	  pp_string (buffer, "for (");
+	  if (flag_enable_cilkplus 
+	      && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
+	    pp_string (buffer, "_Cilk_for (");
+	  else
+	    pp_string (buffer, "for (");
 	  dump_generic_node (buffer, gimple_omp_for_index (gs, i), spc,
 			     flags, false);
 	  pp_string (buffer, " = ");
@@ -1192,6 +1207,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
 	    case GE_EXPR:
 	      pp_greater_equal (buffer);
 	      break;
+	    case NE_EXPR:
+	      pp_string (buffer, "!=");
+	      break;
 	    default:
 	      gcc_unreachable ();
 	    }
@@ -1211,11 +1229,18 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags)
       if (!gimple_seq_empty_p (gimple_omp_body (gs)))
 	{
 	  newline_and_indent (buffer, spc + 2);
-	  pp_left_brace (buffer);
-	  pp_newline (buffer);
-	  dump_gimple_seq (buffer, gimple_omp_body (gs), spc + 4, flags);
-	  newline_and_indent (buffer, spc + 2);
-	  pp_right_brace (buffer);
+	  if (flag_enable_cilkplus
+	      && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
+	    dump_gimple_omp_parallel (buffer, gimple_omp_body (gs),
+				      spc + 4, flags, true);
+	  else 
+	    { 
+	      pp_left_brace (buffer); 
+	      pp_newline (buffer); 
+	      dump_gimple_seq (buffer, gimple_omp_body (gs), spc + 4, flags); 
+	      newline_and_indent (buffer, spc + 2); 
+	      pp_right_brace (buffer);
+	    }
 	}
     }
 }
@@ -1253,13 +1278,15 @@ dump_gimple_omp_single (pretty_printer *buffer, gimple gs, int spc, int flags)
     {
       dump_gimple_fmt (buffer, spc, flags, "%G <%+BODY <%S>%nCLAUSES <", gs,
 		       gimple_omp_body (gs));
-      dump_omp_clauses (buffer, gimple_omp_single_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_single_clauses (gs), spc, flags,
+			false);
       dump_gimple_fmt (buffer, spc, flags, " >");
     }
   else
     {
       pp_string (buffer, "#pragma omp single");
-      dump_omp_clauses (buffer, gimple_omp_single_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_single_clauses (gs), spc, flags,
+			false);
       if (!gimple_seq_empty_p (gimple_omp_body (gs)))
 	{
 	  newline_and_indent (buffer, spc + 2);
@@ -1296,14 +1323,16 @@ dump_gimple_omp_target (pretty_printer *buffer, gimple gs, int spc, int flags)
     {
       dump_gimple_fmt (buffer, spc, flags, "%G%s <%+BODY <%S>%nCLAUSES <", gs,
 		       kind, gimple_omp_body (gs));
-      dump_omp_clauses (buffer, gimple_omp_target_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_target_clauses (gs), spc, flags,
+			false);
       dump_gimple_fmt (buffer, spc, flags, " >");
     }
   else
     {
       pp_string (buffer, "#pragma omp target");
       pp_string (buffer, kind);
-      dump_omp_clauses (buffer, gimple_omp_target_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_target_clauses (gs), spc, flags,
+			false);
       if (gimple_omp_target_child_fn (gs))
 	{
 	  pp_string (buffer, " [child fn: ");
@@ -1332,13 +1361,15 @@ dump_gimple_omp_teams (pretty_printer *buffer, gimple gs, int spc, int flags)
     {
       dump_gimple_fmt (buffer, spc, flags, "%G <%+BODY <%S>%nCLAUSES <", gs,
 		       gimple_omp_body (gs));
-      dump_omp_clauses (buffer, gimple_omp_teams_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_teams_clauses (gs), spc, flags,
+			false);
       dump_gimple_fmt (buffer, spc, flags, " >");
     }
   else
     {
       pp_string (buffer, "#pragma omp teams");
-      dump_omp_clauses (buffer, gimple_omp_teams_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_teams_clauses (gs), spc, flags,
+			false);
       if (!gimple_seq_empty_p (gimple_omp_body (gs)))
 	{
 	  newline_and_indent (buffer, spc + 2);
@@ -1361,7 +1392,8 @@ dump_gimple_omp_sections (pretty_printer *buffer, gimple gs, int spc,
     {
       dump_gimple_fmt (buffer, spc, flags, "%G <%+BODY <%S>%nCLAUSES <", gs,
 		       gimple_omp_body (gs));
-      dump_omp_clauses (buffer, gimple_omp_sections_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_sections_clauses (gs), spc, flags,
+			false);
       dump_gimple_fmt (buffer, spc, flags, " >");
     }
   else
@@ -1374,7 +1406,8 @@ dump_gimple_omp_sections (pretty_printer *buffer, gimple gs, int spc,
 			     flags, false);
 	  pp_greater (buffer);
 	}
-      dump_omp_clauses (buffer, gimple_omp_sections_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_sections_clauses (gs), spc, flags,
+			false);
       if (!gimple_seq_empty_p (gimple_omp_body (gs)))
 	{
 	  newline_and_indent (buffer, spc + 2);
@@ -1846,13 +1879,14 @@ dump_gimple_phi (pretty_printer *buffer, gimple phi, int spc, bool comment,
 
 static void
 dump_gimple_omp_parallel (pretty_printer *buffer, gimple gs, int spc,
-                          int flags)
+                          int flags, bool is_cilk_for)
 {
   if (flags & TDF_RAW)
     {
       dump_gimple_fmt (buffer, spc, flags, "%G <%+BODY <%S>%nCLAUSES <", gs,
                        gimple_omp_body (gs));
-      dump_omp_clauses (buffer, gimple_omp_parallel_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_parallel_clauses (gs), spc, flags,
+			false);
       dump_gimple_fmt (buffer, spc, flags, " >, %T, %T%n>",
                        gimple_omp_parallel_child_fn (gs),
                        gimple_omp_parallel_data_arg (gs));
@@ -1860,8 +1894,12 @@ dump_gimple_omp_parallel (pretty_printer *buffer, gimple gs, int spc,
   else
     {
       gimple_seq body;
-      pp_string (buffer, "#pragma omp parallel");
-      dump_omp_clauses (buffer, gimple_omp_parallel_clauses (gs), spc, flags);
+      if (is_cilk_for) 
+	pp_string (buffer, "compiler-inserted clauses for cilk-for body: ");
+      else
+	pp_string (buffer, "#pragma omp parallel");
+      dump_omp_clauses (buffer, gimple_omp_parallel_clauses (gs), spc, flags,
+			false);
       if (gimple_omp_parallel_child_fn (gs))
 	{
 	  pp_string (buffer, " [child fn: ");
@@ -1906,7 +1944,8 @@ dump_gimple_omp_task (pretty_printer *buffer, gimple gs, int spc,
     {
       dump_gimple_fmt (buffer, spc, flags, "%G <%+BODY <%S>%nCLAUSES <", gs,
                        gimple_omp_body (gs));
-      dump_omp_clauses (buffer, gimple_omp_task_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_task_clauses (gs), spc, flags,
+			false);
       dump_gimple_fmt (buffer, spc, flags, " >, %T, %T, %T, %T, %T%n>",
                        gimple_omp_task_child_fn (gs),
                        gimple_omp_task_data_arg (gs),
@@ -1918,7 +1957,8 @@ dump_gimple_omp_task (pretty_printer *buffer, gimple gs, int spc,
     {
       gimple_seq body;
       pp_string (buffer, "#pragma omp task");
-      dump_omp_clauses (buffer, gimple_omp_task_clauses (gs), spc, flags);
+      dump_omp_clauses (buffer, gimple_omp_task_clauses (gs), spc, flags,
+			false);
       if (gimple_omp_task_child_fn (gs))
 	{
 	  pp_string (buffer, " [child fn: ");
@@ -2137,7 +2177,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer, gimple gs, int spc, int flags)
       break;
 
     case GIMPLE_OMP_PARALLEL:
-      dump_gimple_omp_parallel (buffer, gs, spc, flags);
+      dump_gimple_omp_parallel (buffer, gs, spc, flags, false);
       break;
 
     case GIMPLE_OMP_TASK:
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 0e80d2e..194045c 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -91,13 +91,14 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
-    GF_OMP_FOR_KIND_MASK	= 3 << 0,
+    GF_OMP_FOR_KIND_MASK	= 7 << 0,
     GF_OMP_FOR_KIND_FOR		= 0 << 0,
     GF_OMP_FOR_KIND_DISTRIBUTE	= 1 << 0,
     GF_OMP_FOR_KIND_SIMD	= 2 << 0,
     GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
-    GF_OMP_FOR_COMBINED		= 1 << 2,
-    GF_OMP_FOR_COMBINED_INTO	= 1 << 3,
+    GF_OMP_FOR_KIND_CILKFOR     = 4 << 0,
+    GF_OMP_FOR_COMBINED		= 1 << 3,
+    GF_OMP_FOR_COMBINED_INTO	= 1 << 4,
     GF_OMP_TARGET_KIND_MASK	= 3 << 0,
     GF_OMP_TARGET_KIND_REGION	= 0 << 0,
     GF_OMP_TARGET_KIND_DATA	= 1 << 0,
@@ -4563,6 +4564,16 @@ gimple_omp_for_set_pre_body (gimple gs, gimple_seq pre_body)
   omp_for_stmt->pre_body = pre_body;
 }
 
+/* Returns the induction variable of type TREE from GS that is of type 
+   GIMPLE_STATEMENT_OMP_FOR.  */
+
+static inline tree
+gimple_cilk_for_induction_var (const_gimple gs)
+{
+  const gimple_statement_omp_for *cilk_for_stmt =
+    as_a <const gimple_statement_omp_for> (gs);
+  return cilk_for_stmt->iter->index;
+}
 
 /* Return the clauses associated with OMP_PARALLEL GS.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 9c9998d..c671c98
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6559,6 +6559,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
   bool simd;
   bitmap has_decl_expr = NULL;
 
+  tree orig_init = NULL_TREE, orig_cond = NULL_TREE, orig_incr = NULL_TREE;
   orig_for_stmt = for_stmt = *expr_p;
 
   simd = TREE_CODE (for_stmt) == OMP_SIMD
@@ -6678,6 +6679,11 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       else
 	var = decl;
 
+      /* Original initial, final and increment values are necessary to compute
+	 the loop-count.  Otherwise, they are stored in variables and their
+	 context could be changed, potentially making it impossible to compute
+	 them correctly.  */
+      orig_init = TREE_OPERAND (t, 1);
       tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
 			    is_gimple_val, fb_rvalue);
       ret = MIN (ret, tret);
@@ -6689,10 +6695,22 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gcc_assert (COMPARISON_CLASS_P (t));
       gcc_assert (TREE_OPERAND (t, 0) == decl);
 
-      tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-			    is_gimple_val, fb_rvalue);
-      ret = MIN (ret, tret);
-
+      /* Make sure the original end-value is saved untouched for _Cilk_for.
+	 In C++ templates, gimplifying it modifies the condition value but
+	 keeps the initial value the same.  Since both are used to compute
+	 the loop count, both must be kept in their original form.  */
+      if (flag_enable_cilkplus && TREE_CODE (for_stmt) == CILK_FOR)
+	{
+	  int x = 1;
+	  orig_cond = TREE_OPERAND (t, 1);
+	  copy_tree_r (&orig_cond, &x, NULL);
+	}
+      else
+	{
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL, 
+				is_gimple_val, fb_rvalue);
+	  ret = MIN (ret, tret);
+	}
       /* Handle OMP_FOR_INCR.  */
       t = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
       switch (TREE_CODE (t))
@@ -6713,6 +6731,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	    t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	    t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	    TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	    orig_incr = build_one_cst (TREE_TYPE (t));
 	    break;
 	  }
 
@@ -6726,6 +6745,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
 	  t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t);
 	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  orig_incr = build_one_cst (TREE_TYPE (t));
 	  break;
 
 	case MODIFY_EXPR:
@@ -6753,8 +6773,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	      gcc_unreachable ();
 	    }
 
-	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL,
-				is_gimple_val, fb_rvalue);
+	  orig_incr = TREE_OPERAND (t, 1);
+	  /* Extract the absolute value of the increment.  */
+	  if (TREE_CODE (t) == MINUS_EXPR
+	      || TREE_CODE (TREE_OPERAND (t, 1)) == NEGATE_EXPR
+	      || (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST
+		  && tree_int_cst_sgn (TREE_OPERAND (t, 1)) < 1))
+	    orig_incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), orig_incr);
+	  tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body,
+				NULL, is_gimple_val, fb_rvalue);
 	  ret = MIN (ret, tret);
 	  if (c)
 	    {
@@ -6802,8 +6829,60 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 
   BITMAP_FREE (has_decl_expr);
 
+  /* Save the original initial, final and step-size values into variables.  */
+  tree incr_val = NULL_TREE, init_val = NULL_TREE, cond_val = NULL_TREE;
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      tree stmt_list = alloc_stmt_list ();
+      incr_val = create_tmp_var (TREE_TYPE (orig_incr), "__cilk_incr");
+      tree mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_incr), incr_val,
+			 orig_incr);
+      append_to_statement_list (mod, &stmt_list);
+
+      init_val = create_tmp_var (TREE_TYPE (orig_init), "__cilk_init");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_init), init_val, orig_init);
+      append_to_statement_list (mod, &stmt_list);
+
+      cond_val = create_tmp_var (TREE_TYPE (orig_cond), "__cilk_cond");
+      mod = build2 (MODIFY_EXPR, TREE_TYPE (orig_cond), cond_val, orig_cond);
+      append_to_statement_list (mod, &stmt_list);
+  
+      gimplify_and_add (stmt_list, &for_pre_body);
+    }
   gimplify_and_add (OMP_FOR_BODY (orig_for_stmt), &for_body);
 
+  /* Set the variables holding initial, final and step-size as shared and
+     insert them as clauses.  */
+  if (TREE_CODE (orig_for_stmt) == CILK_FOR)
+    {
+      /* Sometimes an assign is inserted before the OMP_FOR_BODY.  So,
+	 search and find the omp for body.  */
+      gimple for_body_stmt = NULL;
+      for (gimple_stmt_iterator gsi = gsi_start (for_body); !gsi_end_p (gsi);
+	   gsi_next (&gsi))
+	{
+	  for_body_stmt = gsi_stmt (gsi);
+	  if (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL)
+	    break;
+	}
+      gcc_assert (gimple_code (for_body_stmt) == GIMPLE_OMP_PARALLEL);
+      tree orig_clses = gimple_omp_parallel_clauses (for_body_stmt);
+      tree new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = init_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = cond_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      orig_clses = new_clause;
+      new_clause = build_omp_clause (input_location, OMP_CLAUSE_SHARED);
+      OMP_CLAUSE_DECL (new_clause) = incr_val;
+      OMP_CLAUSE_CHAIN (new_clause) = orig_clses;
+
+      gimple_omp_parallel_set_clauses (for_body_stmt, new_clause);
+    }
   if (orig_for_stmt != for_stmt)
     for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
       {
@@ -6825,6 +6904,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
     case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break;
     case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break;
     case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break;
+    case CILK_FOR: kind = GF_OMP_FOR_KIND_CILKFOR; break;
     case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break;
     default:
       gcc_unreachable ();
@@ -7897,6 +7977,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OMP_FOR:
 	case OMP_SIMD:
 	case CILK_SIMD:
+	case CILK_FOR:
 	case OMP_DISTRIBUTE:
 	  ret = gimplify_omp_for (expr_p, pre_p);
 	  break;
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 5a09b33..c3cea3d 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "tree-nested.h"
 #include "tree-eh.h"
+#include "cilk.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -198,6 +199,13 @@ struct omp_for_data
   struct omp_for_data_loop *loops;
 };
 
+/* A structure with necessary elements from _Cilk_for statement.  This
+   structure node is passed to WALK_STMT_INFO->INFO.  */
+struct cilk_for_info
+{
+  bool found;
+  tree induction_var;
+};
 
 static splay_tree all_contexts;
 static int taskreg_nesting_level;
@@ -314,6 +322,9 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
   fd->have_ordered = false;
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
   fd->chunk_size = NULL_TREE;
+  if (flag_enable_cilkplus
+      && gimple_omp_for_kind (fd->for_stmt) == GF_OMP_FOR_KIND_CILKFOR)
+    fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR;
   collapse_iter = NULL;
   collapse_count = NULL;
 
@@ -328,7 +339,13 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	break;
       case OMP_CLAUSE_SCHEDULE:
 	gcc_assert (!distribute);
-	fd->sched_kind = OMP_CLAUSE_SCHEDULE_KIND (t);
+	/* In _Cilk_for the sched kind doesn't make sense since we have
+	   our own scheduling.  This check is done to make sure we do not
+	   hit the asserts given below since we are purposely setting
+	   the sched_kind and the chunk size to hold the grain.  */
+	if (!flag_enable_cilkplus
+	    || gimple_omp_for_kind (fd->for_stmt) != GF_OMP_FOR_KIND_CILKFOR)
+	  fd->sched_kind = OMP_CLAUSE_SCHEDULE_KIND (t);
 	fd->chunk_size = OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (t);
 	break;
       case OMP_CLAUSE_DIST_SCHEDULE:
@@ -391,8 +408,10 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd,
 	case GT_EXPR:
 	  break;
 	case NE_EXPR:
-	  gcc_assert (gimple_omp_for_kind (for_stmt)
-		      == GF_OMP_FOR_KIND_CILKSIMD);
+	  gcc_assert ((gimple_omp_for_kind (for_stmt)
+		       == GF_OMP_FOR_KIND_CILKSIMD)
+		      || (gimple_omp_for_kind (for_stmt)
+			  == GF_OMP_FOR_KIND_CILKFOR));
 	  break;
 	case LE_EXPR:
 	  if (POINTER_TYPE_P (TREE_TYPE (loop->n2)))
@@ -897,7 +916,31 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx)
 	 variable no longer really shared.  */
       if (shared_ctx->is_nested)
 	{
-	  omp_context *up;
+	  omp_context *up = shared_ctx->outer;
+
+	  /* If DECL is the induction variable of an enclosing _Cilk_for,
+	     then it needs to be passed by value, not by pointer, since it
+	     would not be overwritten by the body.  */
+	  if (flag_enable_cilkplus
+	      && gimple_code (up->stmt) == GIMPLE_OMP_FOR
+	      && gimple_omp_for_kind (up->stmt) == GF_OMP_FOR_KIND_CILKFOR) 
+	    while (up) 
+	      { 
+		if (gimple_code (up->stmt) == GIMPLE_OMP_FOR
+		    && gimple_omp_for_kind (up->stmt)
+		    == GF_OMP_FOR_KIND_CILKFOR)
+		  {
+		    struct omp_for_data fd;
+		    /* _Cilk_for always has collapse = 1.  */
+		    struct omp_for_data_loop *loops
+		      = (struct omp_for_data_loop *)
+		      alloca (sizeof (struct omp_for_data_loop));
+		    extract_omp_for_data (up->stmt, &fd, loops);
+		    if (DECL_NAME (decl) == DECL_NAME (fd.loop.v))
+		      return false;
+		  }
+		up = up->outer;
+	      }
 
 	  for (up = shared_ctx->outer; up; up = up->outer)
 	    if (is_taskreg_ctx (up) && maybe_lookup_decl (decl, up))
@@ -1818,27 +1861,107 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	scan_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx);
 }
 
-/* Create a new name for omp child function.  Returns an identifier.  */
+/* Create a new name for omp child function.  Returns an identifier.  If
+   IS_CILK_FOR is true then the suffix for the child function is
+   "_cilk_for_fn".  */
 
 static tree
-create_omp_child_function_name (bool task_copy)
+create_omp_child_function_name (bool task_copy, bool is_cilk_for)
 {
+  if (is_cilk_for)
+    return clone_function_name (current_function_decl, "_cilk_for_fn");
   return (clone_function_name (current_function_decl,
 			       task_copy ? "_omp_cpyfn" : "_omp_fn"));
 }
 
+/* Helper function for walk_gimple_seq.  *GSI_P is the gimple statement
+   iterator passed by walk_gimple_seq and WI->INFO holds the CILK_FOR_INFO
+   structure.  This function sets the values inside this structure if it
+   finds a _Cilk_for statement in *GSI_P.  HANDLED_OPS_P is unused.  */
+
+static tree
+find_cilk_for_stmt (gimple_stmt_iterator *gsi_p,
+		    bool *handled_ops_p ATTRIBUTE_UNUSED,
+		    struct walk_stmt_info *wi)
+{
+  struct cilk_for_info *cf_info = (struct cilk_for_info *) wi->info;
+  gimple stmt = gsi_stmt (*gsi_p);
+
+  if (gimple_code (stmt) == GIMPLE_OMP_FOR
+      && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_CILKFOR)
+      /* For nested _Cilk_for statements, just look into the
+	 outer-most one.  */
+      && !cf_info->found)
+    {
+      cf_info->found = true;
+      cf_info->induction_var = gimple_cilk_for_induction_var (stmt);
+    }
+  return NULL_TREE;
+}
+
+/* Returns true if STMT contains a CILK_FOR statement.  If one is found and
+   IND_VAR is non-NULL, *IND_VAR is set to its induction variable.
+   Otherwise *IND_VAR is left untouched.  */
+
+static bool
+is_cilk_for_stmt (gimple stmt, tree *ind_var)
+{
+  gimple_seq body = stmt;
+  struct walk_stmt_info wi;
+  struct cilk_for_info cf_info;
+  memset (&cf_info, 0, sizeof (struct cilk_for_info));
+  memset (&wi, 0, sizeof (wi));
+  wi.info = &cf_info;
+  walk_gimple_seq (body, find_cilk_for_stmt, NULL, &wi);
+  if (cf_info.found)
+    {
+      if (ind_var)
+	*ind_var = cf_info.induction_var;
+      return true;
+    }
+    
+  return false;
+}
+
+/* Returns the loop difference type for an induction variable of type TYPE.  */
+
+static tree
+cilk_for_check_loop_diff_type (tree type)
+{
+  if (type == integer_type_node)
+    return type;
+  else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node))
+    { 
+      if (TYPE_UNSIGNED (type)) 
+	return uint32_type_node;
+      else
+	return integer_type_node;
+    }
+  else
+    {
+      if (TYPE_UNSIGNED (type)) 
+	return uint64_type_node;
+      else
+	return long_long_integer_type_node;
+    }
+}
+
 /* Build a decl for the omp child function.  It'll not contain a body
    yet, just the bare decl.  */
 
 static void
-create_omp_child_function (omp_context *ctx, bool task_copy)
+create_omp_child_function (omp_context *ctx, bool task_copy,
+			   bool is_cilk_for, tree cilk_var_type)
 {
   tree decl, type, name, t;
-
-  name = create_omp_child_function_name (task_copy);
+ 
+  name = create_omp_child_function_name (task_copy, is_cilk_for);
   if (task_copy)
     type = build_function_type_list (void_type_node, ptr_type_node,
 				     ptr_type_node, NULL_TREE);
+  else if (is_cilk_for)
+    type = build_function_type_list (void_type_node, ptr_type_node,
+				     cilk_var_type, cilk_var_type, NULL_TREE);
   else
     type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
 
@@ -1888,6 +2011,34 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_CONTEXT (t) = decl;
   DECL_RESULT (decl) = t;
 
+  /* _Cilk_for's child function requires two extra parameters called
+     __low and __high that are set by the Cilk runtime when it calls this
+     function.  Please refer to the note in expand_cilk_for_body for
+     explanation of __high and __low parameters.  */
+  if (is_cilk_for)
+    {
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__high"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+
+      t = build_decl (DECL_SOURCE_LOCATION (decl),
+		      PARM_DECL, get_identifier ("__low"), cilk_var_type);
+      DECL_ARTIFICIAL (t) = 1;
+      DECL_NAMELESS (t) = 1;
+      DECL_ARG_TYPE (t) = ptr_type_node;
+      DECL_CONTEXT (t) = current_function_decl;
+      TREE_USED (t) = 1;
+      TREE_ADDRESSABLE (t) = 1;
+      DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
+      DECL_ARGUMENTS (decl) = t;
+    }
   t = build_decl (DECL_SOURCE_LOCATION (decl),
 		  PARM_DECL, get_identifier (".omp_data_i"), ptr_type_node);
   DECL_ARTIFICIAL (t) = 1;
@@ -1895,6 +2046,8 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
   DECL_ARG_TYPE (t) = ptr_type_node;
   DECL_CONTEXT (t) = current_function_decl;
   TREE_USED (t) = 1;
+  if (is_cilk_for)
+    DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
   DECL_ARGUMENTS (decl) = t;
   if (!task_copy)
     ctx->receiver_decl = t;
@@ -2016,7 +2169,15 @@ scan_omp_parallel (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+
+  tree ind_var = NULL_TREE;
+  bool is_cilk_for = (flag_enable_cilkplus && outer_ctx
+		      && is_cilk_for_stmt (outer_ctx->stmt, &ind_var));
+  tree cilk_var_type =
+    (is_cilk_for ? cilk_for_check_loop_diff_type (TREE_TYPE (ind_var))
+     : NULL_TREE);
+
+  create_omp_child_function (ctx, false, is_cilk_for, cilk_var_type);
   gimple_omp_parallel_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_parallel_clauses (stmt), ctx);
@@ -2061,7 +2222,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_ARTIFICIAL (name) = 1;
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
-  create_omp_child_function (ctx, false);
+  create_omp_child_function (ctx, false, false, NULL_TREE);
   gimple_omp_task_set_child_fn (stmt, ctx->cb.dst_fn);
 
   scan_sharing_clauses (gimple_omp_task_clauses (stmt), ctx);
@@ -2074,7 +2235,7 @@ scan_omp_task (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
       DECL_ARTIFICIAL (name) = 1;
       DECL_NAMELESS (name) = 1;
       TYPE_NAME (ctx->srecord_type) = name;
-      create_omp_child_function (ctx, true);
+      create_omp_child_function (ctx, true, false, NULL_TREE);
     }
 
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
@@ -2199,7 +2360,7 @@ scan_omp_target (gimple stmt, omp_context *outer_ctx)
   TYPE_NAME (ctx->record_type) = name;
   if (kind == GF_OMP_TARGET_KIND_REGION)
     {
-      create_omp_child_function (ctx, false);
+      create_omp_child_function (ctx, false, false, NULL_TREE);
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
@@ -2993,6 +3154,15 @@ lower_rec_simd_input_clauses (tree new_var, omp_context *ctx, int &max_vf,
   return true;
 }
 
+/* Returns true if the name of DECL starts with the string NAME.  */
+
+static inline bool
+is_cilk_loop_var (tree decl, const char *name)
+{
+  return (DECL_NAME (decl) && !strncmp (IDENTIFIER_POINTER (DECL_NAME (decl)), 
+					name, strlen (name))); 
+}
+
 /* Generate code to implement the input clauses, FIRSTPRIVATE and COPYIN,
    from the receiver (aka child) side and initializers for REFERENCE_TYPE
    private variables.  Initialization statements go in ILIST, while calls
@@ -3245,6 +3415,18 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
 	      SET_DECL_VALUE_EXPR (new_var, x);
 	      DECL_HAS_VALUE_EXPR_P (new_var) = 1;
 
+	      /* In _Cilk_for, the increment, start and final values
+		 are stored in the clauses inserted by gimplify_omp_for.
+		 The child function uses these values to compute the
+		 induction variable from its __low and __high
+		 parameters.
+		 Record the DECL_VALUE_EXPRs here so that they can be
+		 accessed easily later.  */
+	      if (flag_enable_cilkplus 
+		  && (is_cilk_loop_var (var, "__cilk_init") 
+		      || is_cilk_loop_var (var, "__cilk_cond")
+		      || is_cilk_loop_var (var, "__cilk_incr"))) 
+		SET_DECL_VALUE_EXPR (var, x);
 	      /* ??? If VAR is not passed by reference, and the variable
 		 hasn't been initialized yet, then we'll get a warning for
 		 the store into the omp_data_s structure.  Ideally, we'd be
@@ -4628,6 +4810,252 @@ expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from)
     }
 }
 
+/* Returns true if T is a tree whose code is COMPONENT_REF and its field
+   matches D_F_NAME and the data argument matches D_ARG_NAME.  */
+
+static bool
+cilk_find_field_value (tree t, tree d_arg_name, tree d_f_name)
+{
+  if (TREE_CODE (t) == COMPONENT_REF)
+    {
+      tree arg = TREE_OPERAND (t, 0);
+      tree field = TREE_OPERAND (t, 1);
+      if (TREE_CODE (arg) == ADDR_EXPR || TREE_CODE (arg) == MEM_REF)
+	arg = TREE_OPERAND (arg, 0);
+      if (DECL_NAME (arg) && DECL_NAME (field)
+	  && !strcmp (IDENTIFIER_POINTER (d_arg_name),
+		      IDENTIFIER_POINTER (DECL_NAME (arg)))
+	  && !strcmp (IDENTIFIER_POINTER (d_f_name),
+		      IDENTIFIER_POINTER (DECL_NAME (field)))) 
+	return true;
+    }
+  return false;
+}
+
+/* Find the COMPONENT_REF in all the basic blocks in REGION whose 
+   data-argument is DATA_ARG and field is FIELD and then replace that 
+   COMPONENT_REF value with NEW_VALUE, a VAR_DECL.  */
+
+static void
+cilk_for_find_component_expr (struct omp_region *region, tree data_arg,
+			      tree field, tree new_value)
+{
+  vec<basic_block> bbs;
+  basic_block bb;
+  unsigned ii;
+  tree new_val = NULL_TREE;
+  bbs.create (0);
+  gather_blocks_in_sese_region (region->entry, region->exit, &bbs);
+  /* No need to push the entry bb into BBS since it doesn't get inserted
+     into the child function.  */
+  
+  tree da_name = DECL_NAME (data_arg);
+  tree df_name = DECL_NAME (field);
+  FOR_EACH_VEC_ELT (bbs, ii, bb)    
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+	 gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+	if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	  for (unsigned jj = 1; jj < gimple_num_ops (stmt); jj++)
+	    {
+	      tree *op = gimple_op_ptr (stmt, jj);
+	      if (TREE_CODE (*op) == COMPONENT_REF
+		  && cilk_find_field_value (*op, da_name, df_name))
+		{    
+		  if (TREE_TYPE (*op) == TREE_TYPE (new_value))
+		    new_val = new_value;
+		  else
+		    {
+		      tree t = fold_convert (TREE_TYPE (*op), new_value);
+		      new_val =
+			force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+						  true, GSI_NEW_STMT);
+		    }
+		  gsi_insert_before (&gsi, gimple_build_assign (*op, new_val), 
+				     GSI_NEW_STMT);
+		  *op = new_val;
+		}
+	    }
+      }
+}
+
+/* Find the initial, final and increment values in BODY_STMT's clause
+   and store them in *INIT, *FINAL and *INCR parameters respectively.  */
+
+static void
+find_cilk_for_vars (gimple body_stmt, tree *init, tree *final, tree *incr)
+{
+  /* The names of the initial, final and increment values start with
+     __cilk_init, __cilk_cond and __cilk_incr, respectively.  They are
+     defined in shared clauses, so search those.  */
+  for (tree cc = gimple_omp_parallel_clauses (body_stmt); cc; 
+       cc = OMP_CLAUSE_CHAIN (cc))
+    if (OMP_CLAUSE_CODE (cc) == OMP_CLAUSE_SHARED)
+      {
+	tree decl = OMP_CLAUSE_DECL (cc);
+	if (is_cilk_loop_var (decl, "__cilk_incr"))
+	  { 
+	    *incr = decl;
+	    if (DECL_VALUE_EXPR (*incr))
+	      *incr = DECL_VALUE_EXPR (*incr);
+	  } 
+	else if (is_cilk_loop_var (decl, "__cilk_init"))
+	  { 
+	    *init = decl;
+	    if (DECL_VALUE_EXPR (*init))
+	      *init = DECL_VALUE_EXPR (*init);
+	  }
+	else if (is_cilk_loop_var (decl, "__cilk_cond"))
+	  { 
+	    *final = decl;
+	    if (DECL_VALUE_EXPR (*final))
+	      *final = DECL_VALUE_EXPR (*final);
+	  }
+      }
+}
+ 
+/* Expand the _Cilk_for body starting at REGION.  DATA_ARG, LOW and HIGH
+   are the data argument and the __low and __high parameters of the child
+   function.
+   Note: __low and __high are parameters passed in by the Cilk runtime to
+   indicate the start and end of the section to execute.  */
+
+static void
+expand_cilk_for_body (struct omp_region *region, tree data_arg,
+		      tree low, tree high)
+{
+  struct omp_for_data fd;
+  struct omp_for_data_loop *loops;
+  loops
+    = (struct omp_for_data_loop *)
+      alloca (gimple_omp_for_collapse (last_stmt (region->outer->entry))
+	      * sizeof (struct omp_for_data_loop));
+  extract_omp_for_data (last_stmt (region->outer->entry), &fd, loops);
+  region->sched_kind = fd.sched_kind;
+  basic_block entry_bb = region->entry;
+  
+  /* This is where the body is and the location where we must insert
+     the modification to the induction variable.  */
+  basic_block body_bb = single_succ (region->entry);
+  gimple entry_stmt = last_stmt (region->entry);
+  
+  /* Split the first basic block into two and put the initializer values
+     in the top one.  */
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  basic_block l1_bb = split_block (entry_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (l1_bb);
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd.loop.v));
+  tree ind_var = create_tmp_reg (type, "__cilk_ind_var");
+  tree t = fold_convert (type, low);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_NEW_STMT);
+  gimple stmt = gimple_build_assign (ind_var, fold_convert (type, t));
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  vec_alloc (region->ws_args, 2);
+  tree t1 = null_pointer_node;
+  tree t2 = build_fold_addr_expr (gimple_omp_parallel_child_fn (entry_stmt));
+  if (data_arg)
+    {
+      t1 = build_fold_addr_expr (gimple_omp_parallel_data_arg (entry_stmt));
+      gsi = gsi_start_bb (body_bb);
+      tree init = NULL_TREE, final_val = NULL_TREE, incr = NULL_TREE;
+      find_cilk_for_vars (entry_stmt, &init, &final_val, &incr);
+
+      tree step = fd.loop.step;
+      if (TREE_CODE (fd.loop.step) != INTEGER_CST)
+	step = incr;      
+      step = fold_convert (type, step);
+      if (TREE_CODE (step) == INTEGER_CST && tree_int_cst_sgn (step) < 1)
+	step = fold_build1_loc (UNKNOWN_LOCATION, NEGATE_EXPR, type, step);
+      
+      tree tmp = create_tmp_reg (type, NULL);
+      gsi_insert_before (&gsi, gimple_build_assign (tmp, step),
+			 GSI_NEW_STMT);
+      t = build2 (MULT_EXPR, type, ind_var, tmp);
+      tree tmp2 = create_tmp_reg (type, NULL);
+      gsi_insert_after (&gsi, gimple_build_assign (tmp2, t), GSI_NEW_STMT);
+
+      tmp = create_tmp_reg (type, NULL);
+      init = fold_convert (type, init);
+      tree init_tmp = force_gimple_operand_gsi
+	(&gsi, init, true, NULL_TREE, false, GSI_CONTINUE_LINKING); 
+
+      gsi_insert_after (&gsi, gimple_build_assign (tmp, init_tmp), 
+			GSI_NEW_STMT);
+      if (fd.loop.cond_code == GE_EXPR || fd.loop.cond_code == GT_EXPR) 
+	t = fold_build2 (MINUS_EXPR, type, tmp, tmp2);
+      else 
+	t = fold_build2 (PLUS_EXPR, type, tmp, tmp2);
+
+      t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false, 
+				    GSI_CONTINUE_LINKING);
+      tree tmp3 = create_tmp_reg (type, NULL);
+      gimple stmt = gimple_build_assign (tmp3, t);
+      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+      cilk_for_find_component_expr (region, data_arg, fd.loop.v, tmp3);
+    }
+  region->ws_args->quick_push (t1);
+  region->ws_args->quick_push (t2);
+  
+  gsi = gsi_last_bb (l1_bb);
+  basic_block cond_bb = split_block (l1_bb, gsi_stmt (gsi))->dest;
+  single_succ_edge (l1_bb)->flags = EDGE_FALLTHRU;
+
+  gsi = gsi_last_bb (cond_bb);
+  t = fold_convert (type, high);
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false,
+				GSI_CONTINUE_LINKING);
+  t = build2 (LT_EXPR, boolean_type_node, ind_var, t);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p,
+		 NULL, NULL)
+      || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p,
+		    NULL, NULL))
+    {
+      gsi = gsi_for_stmt (stmt);
+      gimple_regimplify_operands (stmt, &gsi);
+    }
+
+  /* Insert incrementing of induction variable.  */
+  gsi = gsi_last_bb (body_bb);
+  t = build2 (PLUS_EXPR, type, ind_var, build_one_cst (type));
+  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+  gsi_insert_after (&gsi, gimple_build_assign (ind_var, t),
+		    GSI_CONTINUE_LINKING);
+  
+  basic_block exit_bb = region->exit;
+
+  gsi = gsi_last_bb (exit_bb);
+  basic_block last_bb = split_block (exit_bb, gsi_stmt (gsi))->dest;
+  
+  /* Remove the #pragma omp return.  */
+  gsi = gsi_last_bb (exit_bb);
+  gsi_remove (&gsi, true);
+  
+  gsi = gsi_last_bb (last_bb);
+  gsi_insert_before (&gsi, gimple_build_return (NULL), GSI_SAME_STMT);
+  
+  /* Now connect all the basic-blocks.  */
+  edge e = make_edge (cond_bb, last_bb, EDGE_FALSE_VALUE);
+  e->probability = REG_BR_PROB_BASE / 4;
+
+  edge e3 = find_edge (cond_bb, body_bb);
+  e3->probability = REG_BR_PROB_BASE * 3 / 4;
+  e3->flags = EDGE_TRUE_VALUE;
+  
+  edge e2 = find_edge (exit_bb, last_bb);
+  remove_edge (e2);
+  e2 = make_edge (exit_bb, cond_bb, EDGE_FALLTHRU);
+  e2->probability = REG_BR_PROB_BASE;
+  region->exit = last_bb;
+}
+
 /* Expand the OpenMP parallel or task directive starting at REGION.  */
 
 static void
@@ -4640,6 +5068,7 @@ expand_omp_taskreg (struct omp_region *region)
   gimple entry_stmt, stmt;
   edge e;
   vec<tree, va_gc> *ws_args;
+  gimple parcopy_stmt = NULL;
 
   entry_stmt = last_stmt (region->entry);
   child_fn = gimple_omp_taskreg_child_fn (entry_stmt);
@@ -4648,6 +5077,16 @@ expand_omp_taskreg (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  /* The compiler represents a _Cilk_for statement as a #pragma omp for
+     whose body is enclosed in a #pragma omp parallel.  In this routine,
+     we handle inserting the body into the child function and putting a
+     loop around it to go from low to high.  NOTE: Even though this is
+     how the compiler represents them, they do NOT function the same
+     way.  */
+  bool is_cilk_for =
+    (flag_enable_cilkplus && region->outer
+     && is_cilk_for_stmt (last_stmt (region->outer->entry), NULL));
+    
   if (is_combined_parallel (region))
     ws_args = region->ws_args;
   else
@@ -4698,7 +5137,6 @@ expand_omp_taskreg (struct omp_region *region)
 	  basic_block entry_succ_bb = single_succ (entry_bb);
 	  gimple_stmt_iterator gsi;
 	  tree arg, narg;
-	  gimple parcopy_stmt = NULL;
 
 	  for (gsi = gsi_start_bb (entry_succ_bb); ; gsi_next (&gsi))
 	    {
@@ -4755,6 +5193,29 @@ expand_omp_taskreg (struct omp_region *region)
 	    }
 	}
 
+      /* Extract the __high and __low parameters from the child function.  */
+      tree high_arg = NULL_TREE, low_arg = NULL_TREE;
+      if (is_cilk_for)
+	{
+	  for (tree ii_arg = DECL_ARGUMENTS (child_fn); ii_arg != NULL_TREE;
+	       ii_arg = TREE_CHAIN (ii_arg))
+	    {
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)),
+			   "__high"))
+		high_arg = ii_arg;
+	      if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__low"))
+		low_arg = ii_arg;
+	    }
+	  gcc_assert (high_arg);
+	  gcc_assert (low_arg);
+	  expand_cilk_for_body (region, gimple_get_lhs (parcopy_stmt),
+				low_arg, high_arg);
+
+	  /* A new BB is added to the end of EXIT_BB and thus it needs to be
+	     updated.  */
+	  exit_bb = region->exit;
+	}
+
       /* Declare local variables needed in CHILD_CFUN.  */
       block = DECL_INITIAL (child_fn);
       BLOCK_VARS (block) = vec2chain (child_cfun->local_decls);
@@ -4787,7 +5248,7 @@ expand_omp_taskreg (struct omp_region *region)
       single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
 
       /* Convert GIMPLE_OMP_RETURN into a RETURN_EXPR.  */
-      if (exit_bb)
+      if (exit_bb && !is_cilk_for)
 	{
 	  gsi = gsi_last_bb (exit_bb);
 	  gcc_assert (!gsi_end_p (gsi)
@@ -4861,11 +5322,16 @@ expand_omp_taskreg (struct omp_region *region)
       pop_cfun ();
     }
 
-  /* Emit a library call to launch the children threads.  */
-  if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
-    expand_parallel_call (region, new_bb, entry_stmt, ws_args);
-  else
-    expand_task_call (new_bb, entry_stmt);
+  /* In _Cilk_for, the call to the runtime function is inserted by
+     expand_omp_for.  */
+  if (!is_cilk_for)
+    {
+      /* Emit a library call to launch the children threads.  */
+      if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
+	expand_parallel_call (region, new_bb, entry_stmt, ws_args);
+      else
+	expand_task_call (new_bb, entry_stmt);
+    }
   if (gimple_in_ssa_p (cfun))
     update_ssa (TODO_update_ssa_only_virtuals);
 }
@@ -6540,6 +7006,127 @@ expand_omp_for_static_chunk (struct omp_region *region,
     }
 }
 
+/* A subroutine of expand_omp_for.  Insert a call to the Cilk runtime
+   function __cilkrts_cilk_for_64 or __cilkrts_cilk_for_32 at the end of
+   REGION.  Loop information is calculated using step, n1 and n2 from FD.  */
+
+static void
+insert_cilk_for_fn_call (struct omp_region *region, struct omp_for_data *fd)
+{
+  tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+  basic_block entry_bb = region->entry;
+  bool broken_loop = region->cont == NULL;
+  basic_block cont_bb = region->cont;
+  gimple_stmt_iterator gsi = gsi_last_bb (entry_bb);
+  tree diff_type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v));
+
+  tree clauses = gimple_omp_for_clauses (fd->for_stmt);
+
+  tree grain = find_omp_clause (clauses, OMP_CLAUSE_SCHEDULE);
+  gcc_assert (grain != NULL_TREE);
+  grain = OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (grain);
+  
+  /* Convert n2 and n1 to the type we need.  */
+  tree n1 = fold_convert (diff_type, fd->loop.n1);
+  tree n2 = fold_convert (diff_type, fd->loop.n2);
+
+  n1 = force_gimple_operand_gsi (&gsi, n1, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  n2 = force_gimple_operand_gsi (&gsi, n2, true, NULL_TREE, true,
+				 GSI_SAME_STMT);
+  tree diff_val = fold_build2 (MINUS_EXPR, diff_type, n2, n1);
+
+  diff_val = force_gimple_operand_gsi (&gsi, diff_val, true, NULL_TREE,
+					    true, GSI_SAME_STMT);
+  tree step = fd->loop.step;
+  tree step_convert = force_gimple_operand_gsi (&gsi,
+						fold_convert (diff_type, step),
+						true, NULL_TREE, true,
+						GSI_SAME_STMT);
+  tree count = fold_build2 (TRUNC_DIV_EXPR, diff_type, diff_val, step_convert);
+  count = force_gimple_operand_gsi (&gsi, count, true, NULL_TREE, true,
+				    GSI_SAME_STMT);
+
+  tree data_arg_ptr = (*region->ws_args)[0];
+  tree child_fn = (*region->ws_args)[1];
+
+  tree lib_fun = NULL_TREE;
+  if (TYPE_PRECISION (type) == 32)
+    lib_fun = cilk_for_32_fndecl;
+  else if (TYPE_PRECISION (type) == 64)
+    lib_fun = cilk_for_64_fndecl;
+  else
+    gcc_unreachable ();
+
+  vec<tree, va_gc> *args;
+  vec_alloc (args, 4);
+  args->quick_push (child_fn);
+  args->quick_push (data_arg_ptr);
+  args->quick_push (count);
+  args->quick_push (grain);
+  tree t = build_call_expr_loc_vec (UNKNOWN_LOCATION, lib_fun, args);
+  gsi_remove (&gsi, true);
+
+  if (!broken_loop)
+    {
+      /* Code to control the increment goes in the CONT_BB.  */
+      gsi = gsi_last_bb (cont_bb);
+      gimple stmt = gsi_stmt (gsi);
+      gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE);
+      gsi_remove (&gsi, true);
+      
+      /* Remove the edge to the OMP continue block.  */
+      unsigned int ii = 0;
+      while (EDGE_COUNT (cont_bb->succs) > 1)
+	{
+	  edge ee = EDGE_SUCC (cont_bb, ii);
+	  if (!(ee->flags & EDGE_FALLTHRU))
+	    remove_edge (ee);
+	  ii++;
+	}      
+      gsi = gsi_start_bb (cont_bb);
+      gsi_remove (&gsi, true);
+      force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true,
+				GSI_CONTINUE_LINKING);
+    }
+
+  /* Remove GIMPLE_OMP_RETURN.  */
+  gsi = gsi_last_bb (region->exit);
+  gimple stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_OMP_RETURN);
+  gsi_remove (&gsi, true);
+
+  gsi = gsi_last_bb (region->entry);
+  t = fold_build2 (fd->loop.cond_code, boolean_type_node, n1, n2);
+  stmt = gimple_build_cond_empty (t);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+
+  /* In here we are replacing a _Cilk_for statement with something
+     like this:
+
+     if (n1 <cond_code> n2)
+       goto bb1
+     else
+       goto bb2
+     
+     bb1:
+       .omp_data_o.__cilk_incr = __cilk_incr;
+       ...
+       __cilkrts_cilk_for_{32/64} (func_name, &.omp_data_o, <count>, <grain>);
+
+     bb2:
+     clobber all values and go out.  */  
+  unsigned int ii = 0;
+  while (ii < EDGE_COUNT (region->entry->succs))
+    {
+      edge ee = EDGE_SUCC (region->entry, ii);
+      if (ee->flags & EDGE_FALLTHRU)
+	ee->flags = EDGE_TRUE_VALUE;
+      else
+	ee->flags = EDGE_FALSE_VALUE;
+      ii++;
+    }
+}
 
 /* A subroutine of expand_omp_for.  Generate code for a simd non-worksharing
    loop.  Given parameters:
@@ -6880,6 +7467,12 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 
   if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD)
     expand_omp_simd (region, &fd);
+  else if (flag_enable_cilkplus 
+	   && (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_CILKFOR))
+    {
+      region->ws_args = region->inner->ws_args;
+      insert_cilk_for_fn_call (region, &fd);
+    }
   else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
 	   && !fd.have_ordered)
     {
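
For readers following the omp-low.c changes above, here is a minimal serial
sketch in plain C of the shape the lowering produces: a child function that
receives a half-open range [low, high) and rebuilds the user's induction
variable as init + index * step, plus a driver that hands out chunks of at
most "grain" iterations. All names here (loop_data, child_fn,
fake_cilk_for_32, run_loop) are hypothetical stand-ins and do not appear in
the patch. The driver is a serial substitute for __cilkrts_cilk_for_32, not
the runtime itself. The sketch rounds the trip count up, which is an
assumption of the sketch; the patch as posted computes (n2 - n1) / step with
TRUNC_DIV_EXPR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the .omp_data_s record: the variables the
   loop body shares with the spawning function.  */
struct loop_data
{
  int *array;	/* Variable written by the loop body.  */
  int init;	/* Plays the role of __cilk_init.  */
  int step;	/* Plays the role of __cilk_incr.  */
  int value;	/* Value stored by the body.  */
};

/* Child function: run the iterations whose indices fall in [low, high).
   The user-visible induction variable is rebuilt from the index.  */
static void
child_fn (void *data, uint32_t low, uint32_t high)
{
  struct loop_data *d = (struct loop_data *) data;
  for (uint32_t ind = low; ind < high; ind++)
    {
      int ii = d->init + (int) ind * d->step;
      d->array[ii] = d->value;
    }
}

/* Serial substitute for the runtime entry point: split [0, count) into
   chunks of at most GRAIN iterations and call BODY on each chunk.  */
static void
fake_cilk_for_32 (void (*body) (void *, uint32_t, uint32_t), void *data,
		  uint32_t count, int grain)
{
  uint32_t g = grain > 0 ? (uint32_t) grain : 1;
  for (uint32_t low = 0; low < count; low += g)
    {
      uint32_t high = (count - low < g) ? count : low + g;
      body (data, low, high);
    }
}

/* Equivalent of: for (ii = init; ii < final; ii += step) array[ii] = value;
   STEP must be positive.  The guard mirrors the "if (n1 <cond_code> n2)"
   test emitted before the runtime call.  */
static void
run_loop (int *array, int init, int final, int step, int value, int grain)
{
  struct loop_data d = { array, init, step, value };
  if (init < final)
    {
      /* Trip count rounded up (sketch assumption, see above).  */
      uint32_t count = (uint32_t) ((final - init + step - 1) / step);
      fake_cilk_for_32 (child_fn, &d, count, grain);
    }
}
```

Calling run_loop (a, 0, 10, 2, 7, 3) on a zeroed ten-element array stores 7
into the even-numbered elements, using two chunks: indices [0, 3) and [3, 5).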
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
new file mode 100644
index 0000000..a80f413
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c
@@ -0,0 +1,100 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+static void check (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start;  ii < end; ii = ii + incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+static void check_reverse (int *Array, int start, int end, int incr, int value)
+{
+  int ii = 0;
+  for (ii = start; ii >= end; ii = ii - incr)
+    if (Array[ii] != value)
+      __builtin_abort ();
+#if HAVE_IO
+  printf ("Passed\n");
+#endif
+}
+
+
+int main (void)
+{
+  int Array[10];
+  int x = 9, y = 0, z = 3;
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array[ii] = 1133;
+  check (Array, 0, 10, 1, 1133);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 9; ii > -1; ii--)
+    Array[ii] = 4433;
+  check_reverse (Array, 9, 0, 1, 4433);
+
+  _Cilk_for (int ii = 9; ii > -1; --ii)
+    Array[ii] = 9988;
+  check_reverse (Array, 9, 0, 1, 9988);
+
+  _Cilk_for (int ii = 0; ii < 10; ++ii)
+    Array[ii] = 3311;
+  check (Array, 0, 10, 1, 3311);
+
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    Array[ii] = 1328;
+  check (Array, 0, 10, 2, 1328);
+
+  _Cilk_for (int ii = 9; ii >= 0; ii -= 2)
+    Array[ii] = 1738;
+  check_reverse (Array, 9, 0, 2, 1738);
+
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    {
+      if (ii % 2)
+	Array[ii] = 1343;
+      else
+	Array[ii] = 3413;
+    }
+
+  check (Array, 1, 10, 2, 1343); 
+  check (Array, 0, 10, 2, 3413); 
+
+  _Cilk_for (short cc = 0; cc < 10; cc++) 
+    Array[cc] = 1343;
+  check (Array, 0, 10,  1,1343);
+
+  _Cilk_for (short cc = 9; cc >= 0; cc--)
+    Array[cc] = 1348;
+  check_reverse (Array, 9, 0, 1, 1348);
+
+
+
+  /* Loop with polynomials in _Cilk_for.  */
+  _Cilk_for (int ii = z - 3; ii <= z * 3; ii += 2)
+    { 
+      Array[ii] = 3233;
+    }
+
+  for (int ii = z-3; ii <= z*3; ii += 2)
+    if (Array[ii] != 3233)
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
new file mode 100644
index 0000000..ed73c34
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+int main (void)
+{
+  int q = 0, ii = 0, jj = 0;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */
+    /* { dg-error "expected" "" { target c++ } 10 } */
+    q = 5;
+
+  _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */
+    q = 2;
+
+  _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++)  /* { dg-error "expected ';' before ',' token" "" { target c } } */
+    /* { dg-error "invalid controlling predicate" "" { target c++ }  20 } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */
+    q = 5;
+
+
+  _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected" } */ 
+    q = 5;
+
+  _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */
+    q = 5;
+
+ 
+  _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static" } */
+
+    q = 5;
+
+
+  _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */
+    q = 5;
+
+
+  _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */
+    q = 5;
+
+  _Cilk_for (int ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */
+    q = 5;
+
+  _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK!  */
+    q = 5;
+
+  _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" "" { target c } } */ 
+    /* { dg-error "expected" "" { target c++ }  53 } */
+    q = 5;
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
new file mode 100644
index 0000000..6cb9b03
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c
@@ -0,0 +1,35 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+int grain_value = 2;
+int main (void)
+{
+  int Array1[200], Array1_Serial[200];
+
+  for (int ii = 0; ii < 200; ii++)
+    {
+      Array1_Serial[ii] = 2;
+      Array1[ii] = 1;
+    }
+
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 200; ii++)
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+#pragma cilk grainsize = grain_value
+  _Cilk_for (int ii = 0; ii < 200; ii++) 
+    Array1[ii] = 2;
+
+  for (int ii = 0; ii < 200; ii++)
+    if (Array1[ii] != Array1_Serial[ii])
+      return (ii+1);
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
new file mode 100644
index 0000000..e1e3217
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -Wunknown-pragmas" } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+
+char Array1[26];
+
+#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */
+
+int main(int argc, char **argv)
+{
+/* This is OK.  */
+#pragma cilk grainsize = 2
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+
+/* This is OK; the value is converted to long int.  */
+#pragma cilk grainsize = 0.5 
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    Array1[ii] = 0;
+
+#pragma cilk grainsize = 1 
+  while (Array1[5] != 0) /* { dg-warning "is not followed by" } */
+    {
+    /* Blah */
+    }
+
+#pragma cilk grainsize = 1 
+  int q = 0; /* { dg-warning "is not followed by" } */
+  _Cilk_for (q = 0; q < 10; q++)
+    Array1[q]  = 5;
+
+  while (Array1[5] != 0)
+    {
+    /* Blah */
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
new file mode 100644
index 0000000..7a779f7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c
@@ -0,0 +1,41 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+
+
+/* <feature> loop control variable must have integer, pointer or class type
+   </feature>
+*/
+
+#define ARRAY_SIZE 10000
+int a[ARRAY_SIZE];
+
+int main(void)
+{ 
+  int ii = 0;
+
+#if 1
+  for (ii =0; ii < ARRAY_SIZE; ii++)
+    a[ii] = 5;
+#endif
+  _Cilk_for(int *aa = a; aa < a + ARRAY_SIZE; aa++) 
+    *aa = 0;
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii++) 
+    if (a[ii] != 0) 
+      __builtin_abort ();
+#endif
+
+  _Cilk_for (int *aa = a; aa < a + ARRAY_SIZE; aa = aa + 2)
+    *aa = 4;
+
+#if 1
+  for (ii = 0; ii < ARRAY_SIZE; ii = ii + 2) 
+    if (a[ii] != 4) 
+      __builtin_abort ();
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
new file mode 100644
index 0000000..cffe17e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c
@@ -0,0 +1,79 @@
+/* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-fcilkplus" } */
+/* { dg-additional-options "-std=gnu99"  { target c } } */
+/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+
+#if HAVE_IO
+#include <stdio.h>
+#endif
+
+int main (void)
+{
+  int Array[10][10];
+
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj++)
+	{
+	  Array[ii][jj] = 0;
+	}
+
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 5; jj++)
+      Array[ii][jj] = 5;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 5; jj++)
+      if (Array[ii][jj] != 5)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+
+  /* One goes up and one goes down.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 9; jj >= 0; jj--)
+      Array[ii][jj] = 7;
+
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 9; jj >= 0; jj--)
+      if (Array[ii][jj] != 7)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* Different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii++)
+    _Cilk_for (int jj = 0; jj < 10; jj += 2)
+      Array[ii][jj] = 9;
+  
+  for (int ii = 0; ii < 10; ii++)
+    for (int jj = 0; jj < 10; jj += 2)
+      if (Array[ii][jj] != 9)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  /* Different step sizes.  */
+  _Cilk_for (int ii = 0; ii < 10; ii += 2)
+    _Cilk_for (int jj = 5; jj < 9; jj++)
+      Array[ii][jj] = 11; 
+  
+  for (int ii = 0; ii < 10; ii += 2)
+    for (int jj = 5; jj < 9; jj++)
+      if (Array[ii][jj] != 11)
+#if HAVE_IO
+	printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]);
+#else
+	__builtin_abort ();
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index e548a0d..d8c14e3 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -351,6 +351,7 @@ enum omp_clause_schedule_kind {
   OMP_CLAUSE_SCHEDULE_GUIDED,
   OMP_CLAUSE_SCHEDULE_AUTO,
   OMP_CLAUSE_SCHEDULE_RUNTIME,
+  OMP_CLAUSE_SCHEDULE_CILKFOR,
   OMP_CLAUSE_SCHEDULE_LAST
 };
 
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 0595499..04f0cf9 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -296,7 +296,8 @@ dump_array_domain (pretty_printer *buffer, tree domain, int spc, int flags)
    dump_generic_node.  */
 
 static void
-dump_omp_clause (pretty_printer *buffer, tree clause, int spc, int flags)
+dump_omp_clause (pretty_printer *buffer, tree clause, int spc, int flags, 
+		 bool is_cilk_for)
 {
   const char *name;
 
@@ -393,6 +394,13 @@ dump_omp_clause (pretty_printer *buffer, tree clause, int spc, int flags)
       break;
 
     case OMP_CLAUSE_SCHEDULE:
+      if (is_cilk_for)
+	{ 
+	  pp_string (buffer, "grainsize = ");
+	  dump_generic_node (buffer, OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (clause),
+			     spc, flags, false);
+	  break;
+	}
       pp_string (buffer, "schedule(");
       switch (OMP_CLAUSE_SCHEDULE_KIND (clause))
 	{
@@ -644,7 +652,8 @@ dump_omp_clause (pretty_printer *buffer, tree clause, int spc, int flags)
    dump_generic_node.  */
 
 void
-dump_omp_clauses (pretty_printer *buffer, tree clause, int spc, int flags)
+dump_omp_clauses (pretty_printer *buffer, tree clause, int spc, int flags,
+		  bool is_cilk_for)
 {
   if (clause == NULL)
     return;
@@ -652,7 +661,7 @@ dump_omp_clauses (pretty_printer *buffer, tree clause, int spc, int flags)
   pp_space (buffer);
   while (1)
     {
-      dump_omp_clause (buffer, clause, spc, flags);
+      dump_omp_clause (buffer, clause, spc, flags, is_cilk_for);
       clause = OMP_CLAUSE_CHAIN (clause);
       if (clause == NULL)
 	return;
@@ -2360,7 +2369,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 
     case OMP_PARALLEL:
       pp_string (buffer, "#pragma omp parallel");
-      dump_omp_clauses (buffer, OMP_PARALLEL_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_PARALLEL_CLAUSES (node), spc, flags, false);
 
     dump_omp_body:
       if (!(flags & TDF_SLIM) && OMP_BODY (node))
@@ -2377,7 +2386,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 
     case OMP_TASK:
       pp_string (buffer, "#pragma omp task");
-      dump_omp_clauses (buffer, OMP_TASK_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_TASK_CLAUSES (node), spc, flags, false);
       goto dump_omp_body;
 
     case OMP_FOR:
@@ -2391,6 +2400,9 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
     case CILK_SIMD:
       pp_string (buffer, "#pragma simd");
       goto dump_omp_loop;
+    
+    case CILK_FOR:
+      goto dump_omp_loop;
 
     case OMP_DISTRIBUTE:
       pp_string (buffer, "#pragma omp distribute");
@@ -2398,27 +2410,30 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 
     case OMP_TEAMS:
       pp_string (buffer, "#pragma omp teams");
-      dump_omp_clauses (buffer, OMP_TEAMS_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_TEAMS_CLAUSES (node), spc, flags, false);
       goto dump_omp_body;
 
     case OMP_TARGET_DATA:
       pp_string (buffer, "#pragma omp target data");
-      dump_omp_clauses (buffer, OMP_TARGET_DATA_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_TARGET_DATA_CLAUSES (node), spc, flags,
+			false);
       goto dump_omp_body;
 
     case OMP_TARGET:
       pp_string (buffer, "#pragma omp target");
-      dump_omp_clauses (buffer, OMP_TARGET_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_TARGET_CLAUSES (node), spc, flags, false);
       goto dump_omp_body;
 
     case OMP_TARGET_UPDATE:
       pp_string (buffer, "#pragma omp target update");
-      dump_omp_clauses (buffer, OMP_TARGET_UPDATE_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_TARGET_UPDATE_CLAUSES (node), spc, flags,
+			false);
       is_expr = false;
       break;
 
     dump_omp_loop:
-      dump_omp_clauses (buffer, OMP_FOR_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_FOR_CLAUSES (node), spc, flags, 
+			TREE_CODE (node) == CILK_FOR);
 
       if (!(flags & TDF_SLIM))
 	{
@@ -2440,7 +2455,10 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 		{
 		  spc += 2;
 		  newline_and_indent (buffer, spc);
-		  pp_string (buffer, "for (");
+		  if (TREE_CODE (node) == CILK_FOR)
+		    pp_string (buffer, "_Cilk_for (");
+		  else 
+		    pp_string (buffer, "for (");
 		  dump_generic_node (buffer,
 				     TREE_VEC_ELT (OMP_FOR_INIT (node), i),
 				     spc, flags, false);
@@ -2457,13 +2475,29 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 	    }
 	  if (OMP_FOR_BODY (node))
 	    {
-	      newline_and_indent (buffer, spc + 2);
-	      pp_left_brace (buffer);
-	      newline_and_indent (buffer, spc + 4);
-	      dump_generic_node (buffer, OMP_FOR_BODY (node), spc + 4, flags,
-		  false);
-	      newline_and_indent (buffer, spc + 2);
-	      pp_right_brace (buffer);
+	      tree b_node = OMP_FOR_BODY (node);
+	      if (TREE_CODE (node) == CILK_FOR)
+		{
+		  /* If we call dump_generic_node directly, then it will
+		     emit "#pragma omp parallel."  This is something that
+		     we inserted to take advantage of several OMP routines
+		     and conceptually there is NO #pragma omp parallel
+		     enclosing the _Cilk_for's body.  */
+		  newline_and_indent (buffer, spc + 4); 
+		  dump_omp_clauses (buffer, OMP_PARALLEL_CLAUSES (b_node),
+				    spc, flags, false);
+		  dump_generic_node (buffer, OMP_BODY (b_node), spc + 4,
+				     flags, false);
+		}
+	      else
+		{ 
+		  newline_and_indent (buffer, spc + 2); 
+		  pp_left_brace (buffer); 
+		  newline_and_indent (buffer, spc + 4); 
+		  dump_generic_node (buffer, b_node, spc + 4, flags, false); 
+		  newline_and_indent (buffer, spc + 2); 
+		  pp_right_brace (buffer);
+		} 
 	    }
 	  if (OMP_FOR_INIT (node))
 	    spc -= 2 * TREE_VEC_LENGTH (OMP_FOR_INIT (node)) - 2;
@@ -2479,7 +2513,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 
     case OMP_SECTIONS:
       pp_string (buffer, "#pragma omp sections");
-      dump_omp_clauses (buffer, OMP_SECTIONS_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_SECTIONS_CLAUSES (node), spc, flags, false);
       goto dump_omp_body;
 
     case OMP_SECTION:
@@ -2546,11 +2580,11 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 
     case OMP_SINGLE:
       pp_string (buffer, "#pragma omp single");
-      dump_omp_clauses (buffer, OMP_SINGLE_CLAUSES (node), spc, flags);
+      dump_omp_clauses (buffer, OMP_SINGLE_CLAUSES (node), spc, flags, false);
       goto dump_omp_body;
 
     case OMP_CLAUSE:
-      dump_omp_clause (buffer, node, spc, flags);
+      dump_omp_clause (buffer, node, spc, flags, false);
       is_expr = false;
       break;
 
diff --git a/gcc/tree-pretty-print.h b/gcc/tree-pretty-print.h
index d2ab0b7..b19c73d 100644
--- a/gcc/tree-pretty-print.h
+++ b/gcc/tree-pretty-print.h
@@ -39,7 +39,7 @@ extern void print_generic_decl (FILE *, tree, int);
 extern void print_generic_stmt (FILE *, tree, int);
 extern void print_generic_stmt_indented (FILE *, tree, int, int);
 extern void print_generic_expr (FILE *, tree, int);
-extern void dump_omp_clauses (pretty_printer *, tree, int, int);
+extern void dump_omp_clauses (pretty_printer *, tree, int, int, bool);
 extern int dump_generic_node (pretty_printer *, tree, int, int, bool);
 extern void print_declaration (pretty_printer *, tree, int, int);
 extern int op_code_prio (enum tree_code);
diff --git a/gcc/tree.def b/gcc/tree.def
index f8d6444..558d7c8 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1051,6 +1051,10 @@ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6)
    Operands like for OMP_FOR.  */
 DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6)
 
+/* Cilk Plus - _Cilk_for (..)
+   Operands like for OMP_FOR.  */
+DEFTREECODE (CILK_FOR, "cilk_for", tcc_statement, 6)
+
 /* OpenMP - #pragma omp distribute [clause1 ... clauseN]
    Operands like for OMP_FOR.  */
 DEFTREECODE (OMP_DISTRIBUTE, "omp_distribute", tcc_statement, 6)

[-- Attachment #4: cp-ChangeLog --]
[-- Type: application/octet-stream, Size: 1437 bytes --]

gcc/cp/ChangeLog
2014-01-24  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* parser.c (cp_parser_statement): Added a RID_CILK_FOR case.
	(cp_parser_omp_for_cond): Included a check for CILK_FOR along with
	CILK_SIMD.
	(cp_parser_omp_for_loop): Overall, added support to parse _Cilk_for
	statement along with omp for statements.
	(cp_parser_cilk_grainsize): New function.
	(cp_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(cp_parser_cilk_simd): Added a new parameter for grain.  Added support
	to handle _Cilk_for loops along with #pragma simd for loops.
	* pt.c (tsubst_expr): Added CILK_FOR case.
	* semantics.c (handle_omp_for_class_iterator): Added 2 new parameters.
	Added a NE_EXPR case.  Added a check for _Cilk_for statement and
	if so, then give a name for the new induction variable.
	(finish_omp_for): Added a check if the code is _Cilk_for and if true
	then insert all the iterator temporary variables into the _Cilk_for
	body.

gcc/testsuite/ChangeLog
2014-01-24  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Made certain error
	tags C specific and inserted their C++ equivalents.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* g++.dg/cilk-plus/CK/cilk-for-tplt.cc: New testcase.
	* g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_rev_iter.cc: Likewise.
	* g++.dg/cilk-plus/CK/stl_test.cc: Likewise.


[-- Attachment #5: c-ChangeLog --]
[-- Type: application/octet-stream, Size: 4848 bytes --]

gcc/ChangeLog
2014-01-24  Balaji V. Iyer  <balaji.v.iyer@intel.com>
  
	* cilk-builtins.def: Added two new builtin functions called
	__cilkrts_cilk_for_32 and __cilkrts_cilk_for_64.
	* cilk-common.c (cilk_init_builtins): Likewise.
	(cilk_declare_looper): New function.
	* cilk.h (enum cilk_tree_index): Added two new fields called
	CILK_TI_F_LOOP_32 and CILK_TI_F_LOOP_64.
	(cilk_for_32_fndecl): New #define.
	(cilk_for_64_fndecl): Likewise.
	* gimple-pretty-print.c (dump_gimple_omp_for): Added a new case for
	GF_OMP_FOR_KIND_CILKFOR.  Also emitted "_Cilk_for" instead of "for"
	when the gimple kind is GF_OMP_FOR_KIND_CILKFOR.
	(dump_gimple_omp_parallel): Added a new parameter: is_cilk_for.
	Printed a different string if is_cilk_for is true instead of
	"#pragma omp parallel."
	* tree-pretty-print.c (dump_omp_clause): Added a new parameter called
	is_cilk_for.  If it is set to true, then OMP_CLAUSE_SCHEDULE output
	is printed as grainsize.
	(dump_generic_node): Added a CILK_FOR case.  If the tree type is
	CILK_FOR then print "_Cilk_for" instead of "for."  Also do not insert
	the braces required for OMP's for.
	* tree-pretty-print.h (dump_omp_clauses): Added a new bool parameter.
	* gimple.h (enum gf_mask): Added a new field GF_OMP_FOR_KIND_CILKFOR.
	Re-arranged a couple of other fields so that they are all in ascending order.
	(struct gimple_omp_for_iter): Added a new field called "grain."
	(gimple_cilk_for_set_grain): New function.
	(gimple_cilk_for_induction_var): Likewise.
	(gimple_cilk_for_grain): Likewise.
	* gimplify.c (gimplify_omp_for): Added code to handle gimplification
	of a _Cilk_for statement.
	* omp-low.c (struct cilk_for_information): New structure.
	(create_omp_child_function_name): Added a new bool parameter called
	is_cilk_for.  If this is set, then use a different suffix.
	(extract_omp_for_data): Added a check for _Cilk_for's kind for a
	NE_EXPR case.  Added the correct schedule type for _Cilk_for.
	(use_pointer_for_field): Reject using of pointers for the induction
	variable of the outer function.
	(find_cilk_for_stmt): New function.
	(is_cilk_for_stmt): Likewise.
	(cilk_for_check_loop_diff_type): Likewise.
	(expand_cilk_for_body): Likewise.
	(is_cilk_loop_var): Likewise.
	(cilk_find_field_value): Likewise.
	(cilk_find_component_expr): Likewise.
	(find_cilk_for_vars): Likewise.
	(insert_cilk_for_fn_call): Likewise.
	(create_omp_child_function): Added two new parameters to pass in
	whether it is a _Cilk_for body and the induction variable type.  If
	it is _Cilk_for, then create two new parameters and different function-
	type.
	(lower_rec_input_clauses): Set the new decl expr value to the
	variable for the "__cilk_init," "__cilk_cond" and "__cilk_incr"
	variables.
	(scan_omp_parallel): Added a check if the outer statement is a
	_Cilk_for and if so, then find the correct induction variable type to
	pass them into create_omp_child_function.
	(expand_omp_taskreg): Added code to extract the high and low parameters
	from the child function and then insert it in the appropriate location.
	Added a call to expand_cilk_for_body.  Allowed the insertion of the
	library calls when the taskreg being expanded is not a _Cilk_for.
	(expand_omp_for): Added a check for GF_OMP_FOR_KIND_CILKFOR for the
	for statement's kind.  If so then call insert_cilk_for_fn_call.
	* tree-core.h (enum omp_clause_schedule_kind): Added a new field
	OMP_CLAUSE_SCHEDULE_CILK_FOR.
	* tree.def (CILK_FOR): New tree.

gcc/c-family/ChangeLog
2014-01-24  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-omp.c (c_finish_omp_for): Added a check for CILK_FOR along with
	CILK_SIMD.
	* c-common.h (enum rid): Added new value called "RID_CILK_FOR."
	* c-common.c (c_common_reswords[]): Added a new field "_Cilk_for."
	* c-pragma.c (init_pragma): Added cilk grainsize pragma.
	* c-pragma.h (enum pragma_kind): Added PRAGMA_CILK_GRAINSIZE.

gcc/c/ChangeLog
2014-01-24  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-parser.c (c_parser_statement_after_labels): Added a RID_CILK_FOR
	case.
	(c_parser_pragma): Added a PRAGMA_CILK_GRAINSIZE case.
	(c_parser_omp_for_loop): Added new parameter called grain.
	Added handling for _Cilk_for statements.
	(c_parser_cilk_grainsize): New function.
	(c_parser_cilk_simd): Added a new parameter called grain.  Also added
	support to parse _Cilk_for statements.

gcc/testsuite/ChangeLog
2014-01-24  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* c-c++-common/cilk-plus/CK/cilk-fors.c: New testcase.
	* c-c++-common/cilk-plus/CK/nested_cilk_for.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_errors.c: Likewise.
	* c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.


* Re: [PATCH] _Cilk_for for C and C++
  2014-01-23 16:38                                             ` Iyer, Balaji V
@ 2014-01-24 19:41                                               ` Jakub Jelinek
  2014-01-24 20:33                                                 ` Iyer, Balaji V
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Jelinek @ 2014-01-24 19:41 UTC (permalink / raw)
  To: Iyer, Balaji V
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'

On Thu, Jan 23, 2014 at 04:38:53PM +0000, Iyer, Balaji V wrote:
> 	This is how I started to think of it at first, but then when I thought about it ... in _Cilk_for unlike the #pragma simd's for, the for statement - not the body - (e.g. "_Cilk_for (int ii = 0; ii < 10; ii++") doesn't really do anything nor does it belong in the child function. It is really mostly used to calculate the loop count and capture step-size and starting point.
> 
> 	The child function has its own loop that will have a step size of 1 regardless of your step size. You use the step-size to find the correct spot. Let me give you an example:
> 
> _Cilk_for (int ii = 0; ii < 10; ii = ii  + 2)
> {
> 	Array [ii] = 5;
> }
> 
> This is translated to the following (assume grain is something that the user input):
> 
> data_ptr.start = 0;
> data_ptr.end = 10;
> data_ptr.step_size = 2;
> __cilkrts_cilk_for_32 (child_function, &data_ptr, (10-0)/2, grain);
> 
> Child_function (void *data_ptr, int high, int low)
> {
> 	for (xx = low; xx < high; xx++) 
> 	 {
> 		Tmp_var = (xx * data_ptr->step_size) + data_ptr->start;
> 		// Note: if the _Cilk_for was (ii = 9; ii >= 0; ii -= 2), we would have something like this:
> 		// Tmp_var = data_ptr->end - (xx * data_ptr->step_size)
> 		// The for-loop above won't change.  
> 		Array[Tmp_var] = 5;
> 	}
> }

This isn't really much different from
#pragma omp parallel for schedule(runtime, N)
(i.e. the combined construct).  When it is combined, we also don't emit a
call to GOMP_parallel but to some other function to which we pass the
number of iterations and chunk size (== grain in Cilk+ terminology).  The
only (minor) difference is that for OpenMP, when the child function has
handled its whole low ... high range it doesn't exit, but calls a function
to give it the next pair of low/high, and it returns only when that function
tells it there is no further work to do.  But the Cilk+ case is clearly the
same thing, with the "no further work in the current function" answer being
implicit.
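
To make the comparison concrete, here is a minimal serial sketch of the two
child-function shapes.  Everything in it is an assumption made for
illustration: mock_cilk_for and mock_next_chunk are hypothetical stand-ins,
not the real __cilkrts_cilk_for_32 or libgomp entry points, the
(data, low, high) parameter order is a guess, and the iteration count is
rounded up so that ranges not divisible by the step still work.

```c
#include <assert.h>

/* Hypothetical stand-in for the data block the compiler would build
   from the loop header; field names follow the quoted example.  */
struct loop_data
{
  int start;
  int step_size;
  int *array;
};

/* Normalized iteration XX: map it back to the user's induction
   variable, then run the original loop body.  */
static void
run_iteration (struct loop_data *d, int xx)
{
  int ii = xx * d->step_size + d->start;
  d->array[ii] = 5;
}

/* Cilk+ style child: handles one [low, high) range and returns, so
   "no further work" is implicit.  */
static void
cilk_child (void *data, int low, int high)
{
  for (int xx = low; xx < high; xx++)
    run_iteration ((struct loop_data *) data, xx);
}

/* Serial mock of the runtime: split [0, count) into grain-sized
   ranges.  A real runtime would hand these ranges to workers.  */
static void
mock_cilk_for (void (*child) (void *, int, int), void *data,
	       int count, int grain)
{
  for (int low = 0; low < count; low += grain)
    child (data, low, low + grain < count ? low + grain : count);
}

/* Mock of the OpenMP-side "give me the next range" call.  */
struct chunk_state
{
  int next;
  int count;
  int chunk;
};

static int
mock_next_chunk (struct chunk_state *s, int *low, int *high)
{
  if (s->next >= s->count)
    return 0;			/* No further work.  */
  *low = s->next;
  *high = s->next + s->chunk < s->count ? s->next + s->chunk : s->count;
  s->next = *high;
  return 1;
}

/* OpenMP style child: keeps asking for ranges until told to stop.  */
static void
omp_child (struct loop_data *d, struct chunk_state *s)
{
  int low, high;
  while (mock_next_chunk (s, &low, &high))
    for (int xx = low; xx < high; xx++)
      run_iteration (d, xx);
}

/* Iterations of "for (ii = start; ii < end; ii += step)".  */
static int
iteration_count (int start, int end, int step)
{
  assert (step > 0);
  return end > start ? (end - start + step - 1) / step : 0;
}

void
cilk_style (int *array, int start, int end, int step, int grain)
{
  struct loop_data d = { start, step, array };
  assert (grain > 0);
  mock_cilk_for (cilk_child, &d, iteration_count (start, end, step), grain);
}

void
omp_style (int *array, int start, int end, int step, int chunk)
{
  struct loop_data d = { start, step, array };
  struct chunk_state s = { 0, iteration_count (start, end, step), chunk };
  assert (chunk > 0);
  omp_child (&d, &s);
}
```

Calling cilk_style (a, 0, 10, 2, 2) and omp_style (b, 0, 10, 2, 2) on two
zeroed 10-element arrays stores 5 at indices 0, 2, 4, 6 and 8 in both.  The
only structural difference is who owns the loop over ranges: the mock
runtime in the first case, the child function in the second.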

So, I'd strongly prefer that you swap the parallel with the Cilk_for, set
the flag saying that the two are combined (as OpenMP already does for tons
of constructs), and treat them together during expansion.

	Jakub


* RE: [PATCH] _Cilk_for for C and C++
  2014-01-24 19:41                                               ` Jakub Jelinek
@ 2014-01-24 20:33                                                 ` Iyer, Balaji V
  0 siblings, 0 replies; 42+ messages in thread
From: Iyer, Balaji V @ 2014-01-24 20:33 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, 'Jeff Law', 'Aldy Hernandez',
	'gcc-patches@gcc.gnu.org', 'rth@redhat.com'



> -----Original Message-----
> From: Jakub Jelinek [mailto:jakub@redhat.com]
> Sent: Friday, January 24, 2014 2:42 PM
> To: Iyer, Balaji V
> Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'rth@redhat.com'
> Subject: Re: [PATCH] _Cilk_for for C and C++
> 
> On Thu, Jan 23, 2014 at 04:38:53PM +0000, Iyer, Balaji V wrote:
> > 	This is how I started to think of it at first, but then when I thought
> about it ... in _Cilk_for unlike the #pragma simd's for, the for statement - not
> the body - (e.g. "_Cilk_for (int ii = 0; ii < 10; ii++") doesn't really do anything
> nor does it belong in the child function. It is really mostly used to calculate the
> loop count and capture step-size and starting point.
> >
> > 	The child function has its own loop that will have a step size of 1
> regardless of your step size. You use the step-size to find the correct spot.
> Let me give you an example:
> >
> > _Cilk_for (int ii = 0; ii < 10; ii = ii  + 2) {
> > 	Array [ii] = 5;
> > }
> >
> > This is translated to the following (assume grain is something that the user
> input):
> >
> > data_ptr.start = 0;
> > data_ptr.end = 10;
> > data_ptr.step_size = 2;
> > __cilkrts_cilk_for_32 (child_function, &data_ptr, (10-0)/2, grain);
> >
> > Child_function (void *data_ptr, int high, int low) {
> > 	for (xx = low; xx < high; xx++)
> > 	 {
> > 		Tmp_var = (xx * data_ptr->step_size) + data_ptr->start;
> > 		// Note: if the _Cilk_for was (ii = 9; ii >= 0; ii -= 2), we would
> have something like this:
> > 		// Tmp_var = data_ptr->end - (xx * data_ptr->step_size)
> > 		// The for-loop above won't change.
> > 		Array[Tmp_var] = 5;
> > 	}
> > }
> 
> This isn't really much different from
> #pragma omp parallel for schedule(runtime, N) (i.e. the combined
> construct), when it is combined, we also don't emit a call to GOMP_parallel
> but to some other function to which we pass the number of iterations and
> chunk size (== grain in Cilk+ terminology), the only (minor) difference is that
> for OpenMP when you handle the whole low ...
> high range the child function doesn't exit, but calls a function to give it next
> pair of low/high and only when that function tells it there is no further work
> to do, it returns.  But, the Cilk+ case is clearly the same thing with just implicit
> telling there is no further work in the current function.
> 
> So, I'd strongly prefer if you swap the parallel with Cilk_for, just set the flag
> that the two are combined like OpenMP already has for tons of constructs,
> and during expansion you just treat it together.

Hi Jakub,
	What you are suggesting here would require a significant rewrite of the code. This version of _Cilk_for works, and it shares a significant amount of work with the OMP routines, as requested by other GCC developers. Given the time constraints, let's try to get this version accepted so that the feature is available to users; we will look into moving toward your suggestion when phase 1 opens again.

Thanks,

Balaji V. Iyer.


> 
> 	Jakub


end of thread, other threads:[~2014-01-24 20:33 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-15 21:45 [PATCH] _Cilk_for for C and C++ Iyer, Balaji V
2013-11-16  1:38 ` Aldy Hernandez
2013-11-19  1:11   ` Iyer, Balaji V
2013-11-22 19:45     ` Jason Merrill
2013-11-26  8:16       ` Iyer, Balaji V
2013-11-27 17:59         ` Jason Merrill
2013-11-27 22:31           ` Jeff Law
2013-11-27 23:04             ` Iyer, Balaji V
2013-11-28  8:31               ` Jason Merrill
2013-12-03  6:30                 ` Jeff Law
2013-12-03 13:26                   ` Iyer, Balaji V
2013-12-03 13:40                     ` Jakub Jelinek
2013-12-03 14:01                       ` Iyer, Balaji V
2013-12-03 14:10                         ` Jakub Jelinek
2013-12-03 19:44                     ` Jeff Law
2013-12-16  0:40                     ` Iyer, Balaji V
2013-12-16 21:21                       ` Jason Merrill
2013-12-16 23:41                         ` Iyer, Balaji V
2013-12-18  0:22                         ` Iyer, Balaji V
2014-01-07 20:40                           ` Jason Merrill
2014-01-07 21:24                             ` Iyer, Balaji V
2014-01-07 21:29                               ` Jakub Jelinek
2014-01-07 22:12                                 ` Iyer, Balaji V
2014-01-08 17:31                                   ` Jakub Jelinek
2014-01-08 19:46                                     ` Iyer, Balaji V
2014-01-16 17:29                                       ` Jason Merrill
2014-01-16 17:39                                         ` Jakub Jelinek
2014-01-19  4:50                                         ` Iyer, Balaji V
2014-01-23 10:12                                           ` Jakub Jelinek
2014-01-23 16:38                                             ` Iyer, Balaji V
2014-01-24 19:41                                               ` Jakub Jelinek
2014-01-24 20:33                                                 ` Iyer, Balaji V
2014-01-24 19:28                                             ` Iyer, Balaji V
2014-01-16 21:19                                       ` Aldy Hernandez
2014-01-17  9:24                                         ` Marek Polacek
2014-01-19  4:53                                         ` Iyer, Balaji V
2014-01-06 22:29                         ` Iyer, Balaji V
2013-11-27 23:55     ` Jeff Law
2013-11-20  8:05 ` Aldy Hernandez
2013-11-27 18:37 ` Jason Merrill
2013-11-27 18:49   ` Jakub Jelinek
2013-11-27 19:04     ` Aldy Hernandez
