* RFC: simd enabled functions (omp declare simd / elementals)
@ 2013-11-01  3:05 Aldy Hernandez
  2013-11-01 10:57 ` Jakub Jelinek
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Aldy Hernandez @ 2013-11-01  3:05 UTC
  To: Jakub Jelinek, Richard Henderson, Richard Biener, Jan Hubicka,
	Martin Jambor
  Cc: gcc-patches, Iyer, Balaji V

[-- Attachment #1: Type: text/plain, Size: 6246 bytes --]

Hello gentlemen.  I'm CCing all of you because each of you can provide 
valuable feedback on the parts of the compiler that I touch.  I 
have sprinkled love notes with your names throughout the post :).

This is a patch against the gomp4 branch.  It provides initial support 
for simd-enabled functions, which are "#pragma omp declare simd" in the 
OpenMP world and elementals in Cilk Plus nomenclature.  The parsing bits 
for OpenMP are already in trunk, but they are silently ignored.  This 
patch aims to remedy the situation.  The Cilk Plus parsing bits, OTOH, 
are not ready, but could trivially be adapted to use this infrastructure 
(see below).

I would like to at least get this into the gomp4 branch for now, because 
I am accumulating far too many changes locally.

The main idea is that for a simd annotated function, we can create one 
or more cloned vector variants of a scalar function that can later be 
used by the vectorizer.

For a simple example with multiple returns...

#pragma omp declare simd simdlen(4) notinbranch
int foo (int a, int b)
{
   if (a == b)
     return 555;
   else
     return 666;
}

...we would generate with this patch (unoptimized):

foo.simdclone.0 (vector(4) int simd.4, vector(4) int simd.5)
{
   unsigned int iter.6;
   int b.3[4];
   int a.2[4];
   int retval.1[4];
   int _3;
   int _5;
   int _6;
   vector(4) int _7;

   <bb 2>:
   a.2 = VIEW_CONVERT_EXPR<int[4]>(simd.4);
   b.3 = VIEW_CONVERT_EXPR<int[4]>(simd.5);
   iter.6_12 = 0;

   <bb 3>:
   # iter.6_9 = PHI <iter.6_12(2), iter.6_14(6)>
   _5 = a.2[iter.6_9];
   _6 = b.3[iter.6_9];
   if (_5 == _6)
     goto <bb 5>;
   else
     goto <bb 4>;

   <bb 4>:

   <bb 5>:
   # _3 = PHI <555(3), 666(4)>
   retval.1[iter.6_9] = _3;
   iter.6_14 = iter.6_9 + 1;
   if (iter.6_14 < 4)
     goto <bb 6>;
   else
     goto <bb 7>;

   <bb 6>:
   goto <bb 3>;

   <bb 7>:
   _7 = VIEW_CONVERT_EXPR<vector(4) int>(retval.1);
   return _7;

}

The new loop is properly created and annotated with 
loop->force_vect=true and loop->safelen set.
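
As a rough source-level illustration of the dump above (not part of the 
patch), the clone behaves like the hand-written C below.  It assumes the 
GCC vector extensions; the name foo_simdclone_sketch is made up for this 
sketch, and memcpy stands in for the VIEW_CONVERT_EXPRs.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extension: four ints in one 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* The scalar function from the example.  */
static int
foo (int a, int b)
{
  if (a == b)
    return 555;
  else
    return 666;
}

/* Hand-written stand-in for foo.simdclone.0 (hypothetical name).  */
static v4si
foo_simdclone_sketch (v4si simd_a, v4si simd_b)
{
  int a[4], b[4], retval[4];
  v4si result;

  /* Sanity check: the vector really has four int lanes.  */
  assert (sizeof (v4si) == 4 * sizeof (int));

  /* Counterpart of the VIEW_CONVERT_EXPRs on entry: spill the vector
     arguments into per-lane arrays.  */
  memcpy (a, &simd_a, sizeof a);
  memcpy (b, &simd_b, sizeof b);

  /* The scalar body wrapped in a simdlen-trip loop.  */
  for (unsigned int iter = 0; iter < 4; iter++)
    retval[iter] = foo (a[iter], b[iter]);

  /* Counterpart of the VIEW_CONVERT_EXPR on exit.  */
  memcpy (&result, retval, sizeof result);
  return result;
}
```

The loop is what the vectorizer is later expected to turn into straight 
vector code, which is why it is marked force_vect.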

A possible use may be:

int array[1000];
void bar ()
{
   int i;
   for (i=0; i < 1000; ++i)
     array[i] = foo(i, 123);
}

In which case, we would use the simd clone if available:

bar ()
{
   vector(4) int vect_cst_.21;
   vector(4) int vect_i_6.20;
   vector(4) int * vectp_array.19;
   vector(4) int * vectp_array.18;
   vector(4) int vect_cst_.17;
   vector(4) int vect__4.16;
   vector(4) int vect_vec_iv_.15;
   vector(4) int vect_cst_.14;
   vector(4) int vect_cst_.13;
   int stmp_var_.12;
   int i;
   unsigned int ivtmp_1;
   int _4;
   unsigned int ivtmp_7;
   unsigned int ivtmp_20;
   unsigned int ivtmp_21;

   <bb 2>:
   vect_cst_.13_8 = { 0, 1, 2, 3 };
   vect_cst_.14_2 = { 4, 4, 4, 4 };
   vect_cst_.17_13 = { 123, 123, 123, 123 };
   vectp_array.19_15 = &array;
   vect_cst_.21_5 = { 1, 1, 1, 1 };
   goto <bb 4>;

   <bb 3>:

   <bb 4>:
   # i_9 = PHI <i_6(3), 0(2)>
   # ivtmp_1 = PHI <ivtmp_7(3), 1000(2)>
   # vect_vec_iv_.15_11 = PHI <vect_vec_iv_.15_12(3), vect_cst_.13_8(2)>
   # vectp_array.18_16 = PHI <vectp_array.18_17(3), vectp_array.19_15(2)>
   # ivtmp_20 = PHI <ivtmp_21(3), 0(2)>
   vect_vec_iv_.15_12 = vect_vec_iv_.15_11 + vect_cst_.14_2;
   vect__4.16_14 = foo.simdclone.0 (vect_vec_iv_.15_11, vect_cst_.17_13);
   _4 = 0;
   MEM[(int *)vectp_array.18_16] = vect__4.16_14;
   vect_i_6.20_19 = vect_vec_iv_.15_11 + vect_cst_.21_5;
   i_6 = i_9 + 1;
   ivtmp_7 = ivtmp_1 - 1;
   vectp_array.18_17 = vectp_array.18_16 + 16;
   ivtmp_21 = ivtmp_20 + 1;
   if (ivtmp_21 < 250)
     goto <bb 3>;
   else
     goto <bb 5>;

   <bb 5>:
   return;

}

That's the idea.
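
For readers who prefer source to GIMPLE, here is a hedged C sketch of 
what the vectorized bar() amounts to.  It assumes the GCC vector 
extensions, and foo_simdclone_sketch is a made-up stand-in for the real 
clone.

```c
#include <assert.h>
#include <string.h>

typedef int v4si __attribute__ ((vector_size (16)));

static int array[1000];

/* Stand-in for the simd clone of foo (hypothetical name).  */
static v4si
foo_simdclone_sketch (v4si a, v4si b)
{
  v4si r = { 0, 0, 0, 0 };
  for (int lane = 0; lane < 4; lane++)
    r[lane] = (a[lane] == b[lane]) ? 555 : 666;
  return r;
}

/* Source-level view of the vectorized loop in bar ().  */
static void
bar_vectorized_sketch (void)
{
  v4si iv = { 0, 1, 2, 3 };                  /* vect_cst_.13 */
  const v4si step = { 4, 4, 4, 4 };          /* vect_cst_.14 */
  const v4si cst = { 123, 123, 123, 123 };   /* vect_cst_.17 */

  /* Sanity check: the vector really has four int lanes.  */
  assert (sizeof (v4si) == 4 * sizeof (int));

  /* 1000 scalar iterations become 1000 / 4 = 250 vector iterations.  */
  for (unsigned int n = 0; n < 1000 / 4; n++)
    {
      v4si res = foo_simdclone_sketch (iv, cst);
      memcpy (&array[n * 4], &res, sizeof res);  /* 16-byte store */
      iv += step;                                /* advance i by 4 */
    }
}
```

Since 1000 is a multiple of 4, no scalar epilogue loop is needed here.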

Some of the ABI issues still need to be resolved (mangling for avx-512, 
what to do with non x86 architectures, what (if any) default clones will 
be created when no vector length is specified, etc etc), but the main 
functionality can be seen above.

Uniform and linear parameters (which are passed as scalars) are still 
not handled.  Also, Jakub mentioned that with the current vectorizer we 
probably can't make good use of the inbranch/masked clones.  I have a 
laundry list of missing things prefixed with // FIXME if anyone is curious.
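
To make the uniform/linear case concrete, here is a sketch of what such 
a clone would compute once supported.  It is not what the patch 
generates today.  The function scaled() and the clone name are invented 
for the example, and the clauses assumed on the scalar function are 
uniform(k) and linear(i:1).

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* Scalar function, assumed to be declared with:
   #pragma omp declare simd simdlen(4) uniform(k) linear(i:1) notinbranch  */
static int
scaled (int i, int k)
{
  return i * k;
}

/* Hypothetical clone: both arguments stay scalar.  K is the same in
   every lane; I is the lane-0 value and grows by the step per lane.  */
static v4si
scaled_simdclone_sketch (int i, int k)
{
  const int step = 1;   /* from linear(i:1) */
  v4si r = { 0, 0, 0, 0 };

  /* Sanity check: the vector really has four int lanes.  */
  assert (sizeof (v4si) == 4 * sizeof (int));

  for (int lane = 0; lane < 4; lane++)
    r[lane] = scaled (i + lane * step, k);
  return r;
}
```

Only the return value is a vector here; the caller does not have to 
build vectors for i or k.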

I'd like some feedback from y'all in your respective areas, since this 
touches a few places besides OpenMP.  For instance...

[Honza] Where do you suggest I place a list of simd clones for a 
particular (scalar) function?  Right now I have added a simdclone_of 
field in cgraph_node and am (temporarily) serially scanning all 
functions in get_simd_clone().  This is obviously inefficient.  I didn't 
know whether to use the current next_sibling_clone/etc fields or create 
my own.  I tried using clone_of, and that caused some havoc so I'd like 
some feedback.
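
One possible shape for this, purely as a discussion aid: hang a singly 
linked list of clones off the scalar node so lookup only walks that 
function's clones.  All names below (node_sketch, simd_clones, 
next_simd_clone) are hypothetical, and the lookup matches on simdlen 
only, whereas get_simd_clone() compares the return vector type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for cgraph_node.  */
struct node_sketch
{
  const char *name;
  unsigned int simdlen;                 /* 0 for scalar functions */
  struct node_sketch *simd_clones;      /* scalar node: head of clone list */
  struct node_sketch *next_simd_clone;  /* clone node: next sibling */
};

/* Register CLONE as a simd clone of SCALAR (pushes at the head).  */
static void
add_simd_clone (struct node_sketch *scalar, struct node_sketch *clone)
{
  assert (clone->simdlen != 0);
  clone->next_simd_clone = scalar->simd_clones;
  scalar->simd_clones = clone;
}

/* Return the clone of SCALAR with the given SIMDLEN, or NULL.  Only
   the clones of SCALAR are visited, not every function in the unit.  */
static struct node_sketch *
find_simd_clone (struct node_sketch *scalar, unsigned int simdlen)
{
  for (struct node_sketch *c = scalar->simd_clones; c;
       c = c->next_simd_clone)
    if (c->simdlen == simdlen)
      return c;
  return NULL;
}
```

With such a list the has_simd_clones bit would be redundant, since a 
non-NULL list head carries the same information.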

[Martin] I have adapted the ipa_parm_adjustment infrastructure to allow 
adding new arguments out of the blue like you mentioned was missing in 
ipa-prop.h.  I have also added support for creating vectors of 
arguments.  Could you take a look at my changes to ipa-prop.[ch]?

[Martin] I need to add new arguments in the case of inbranch clones, 
which add an additional vector with a mask as the last argument:  For 
the following:

#pragma omp declare simd simdlen(4) inbranch
int foo (int a)
{
   return a + 1234;
}

...we would generate a clone with:

vector(4) int
foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5)

I thought it best to enhance ipa_modify_formal_parameters() and 
associated machinery rather than add the new argument ad hoc.  We already 
have enough ways of doing tree and cgraph versioning in the compiler ;-).
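
A hedged sketch of what the masked clone is meant to compute follows.  
It assumes a nonzero mask element marks an active lane and, arbitrarily, 
leaves inactive result lanes at zero; neither detail is settled by this 
patch, and foo_inbranch_sketch is a made-up name.

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* The scalar function from the inbranch example.  */
static int
foo (int a)
{
  return a + 1234;
}

/* Hypothetical inbranch clone: the mask is the last argument.  */
static v4si
foo_inbranch_sketch (v4si simd_a, v4si mask)
{
  v4si r = { 0, 0, 0, 0 };

  /* Sanity check: the vector really has four int lanes.  */
  assert (sizeof (v4si) == 4 * sizeof (int));

  /* Run the scalar body only for lanes the caller marked active
     (assumption: nonzero mask element means active).  */
  for (int lane = 0; lane < 4; lane++)
    if (mask[lane] != 0)
      r[lane] = foo (simd_a[lane]);
  return r;
}
```

A caller inside a vectorized conditional would pass the branch condition 
as the mask and blend the result with the values from the other arm.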

[Richi] I would appreciate feedback on the vectorizer and the 
infrastructure as a whole.  Do keep in mind that this is a work in 
progress :).

[Balaji] This patch would provide the infrastructure that can be used by 
the Cilk Plus elementals.  When this is complete, all that would be 
missing is the parser.  You would have to tag the original function with 
"omp declare simd" and "cilk plus elemental" attributes.  See 
simd_clone_clauses_extract.

[Jakub/rth]: As usual, valuable feedback on OpenMP and everything else 
is greatly appreciated.

Oh yeah, there are many more changes that would ideally be needed in the 
vectorizer.

Fire away!

[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 56932 bytes --]

gcc/ChangeLog.elementals

	* Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H.
	* tree-vect-stmts.c (vectorizable_call): Allow > 3 arguments when
	a SIMD clone may be available.
	(vectorizable_function): Use SIMD clone if available.
	* ipa-cp.c (determine_versionability): Nodes with SIMD clones are
	not versionable.
	* ggc.h (ggc_alloc_cleared_simd_clone_stat): New.
	* cgraph.h (enum linear_stride_type): New.
	(struct simd_clone_arg): New.
	(struct simd_clone): New.
	(struct cgraph_node): Add simdclone, simdclone_of and
	has_simd_clones fields.
	(get_simd_clone): Protoize.
	* cgraph.c (get_simd_clone): New.
	* ipa-prop.h (ipa_sra_modify_function_body): Protoize.
	(sra_ipa_modify_expr): Same.
	(struct ipa_parm_adjustment): Add new_arg_prefix and new_param
	fields.  Document their use.
	* ipa-prop.c (ipa_modify_formal_parameters): Handle creating brand
	new parameters and minor cleanups.
	* omp-low.c: Add new pass_omp_simd_clone support code.
	(make_pass_omp_simd_clone): New.
	(pass_data_omp_simd_clone): Declare.
	(class pass_omp_simd_clone): Declare.
	(vecsize_mangle): New.
	(ipa_omp_simd_clone): New.
	(simd_clone_clauses_extract): New.
	(simd_clone_compute_base_data_type): New.
	(simd_clone_compute_vecsize_and_simdlen): New.
	(simd_clone_create): New.
	(simd_clone_adjust_return_type): New.
	(simd_clone_adjust_return_types): New.
	(simd_clone_adjust): New.
	(simd_clone_init_simd_arrays): New.
	(ipa_simd_modify_function_body): New.
	(simd_clone_mangle): New.
	(simd_clone_struct_alloc): New.
	(simd_clone_struct_copy): New.
	(class argno_map): New.
	(argno_map::argno_map(tree)): New.
	(argno_map::~argno_map): New.
	(argno_map::operator []): New.
	(argno_map::length): New.
	(expand_simd_clones): New.
	(create_tmp_simd_array): New.
	* tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): New.
	* tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document.
	* tree-pass.h (make_pass_omp_simd_clone): New.
	* passes.def (pass_omp_simd_clone): New.
	* target.def: Define new hook prefix "TARGET_CILKPLUS_".
	(default_vecsize_mangle): New.
	(vecsize_for_mangle): New.
	* doc/tm.texi.in: Add placeholder for
	TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE and
	TARGET_CILKPLUS_VECSIZE_FOR_MANGLE.
	* tree-sra.c (sra_ipa_modify_expr): Remove static modifier.
	(ipa_sra_modify_function_body): Same.
	* tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define.
	* doc/tm.texi: Regenerate.
	* config/i386/i386.c (ix86_cilkplus_default_vecsize_mangle): New.
	(ix86_cilkplus_vecsize_for_mangle): New.
	(TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE): New.
	(TARGET_CILKPLUS_VECSIZE_FOR_MANGLE): New.

index 0000000..3f28f42
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 124ee0a..561527f 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2998,4 +2998,29 @@ cgraph_get_body (struct cgraph_node *node)
   return true;
 }
 
+/* Given a NODE, return a compatible SIMD clone returning `vectype'.
+   If none found, NULL is returned.  */
+
+struct cgraph_node *
+get_simd_clone (struct cgraph_node *node, tree vectype)
+{
+  if (!node->has_simd_clones)
+    return NULL;
+
+  /* FIXME: What to do with linear/uniform arguments.  */
+
+  /* FIXME: Nasty kludge until we figure out where to put the clone
+     list-- perhaps, next_sibling_clone/prev_sibling_clone in
+     cgraph_node ??.  */
+  struct cgraph_node *t;
+  FOR_EACH_FUNCTION (t)
+    if (t->simdclone_of == node
+	/* No inbranch vectorization for now.  */
+	&& !t->simdclone->inbranch
+	&& types_compatible_p (TREE_TYPE (TREE_TYPE (t->symbol.decl)),
+			       vectype))
+      break;
+  return t;
+}
+
 #include "gt-cgraph.h"
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index afdeaba..c8d1830 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -248,6 +248,91 @@ struct GTY(()) cgraph_clone_info
   bitmap combined_args_to_skip;
 };
 
+enum linear_stride_type {
+  LINEAR_STRIDE_NO,
+  LINEAR_STRIDE_YES_CONSTANT,
+  LINEAR_STRIDE_YES_VARIABLE
+};
+
+/* Function arguments in the original function of a SIMD clone.
+   Supplementary data for `struct simd_clone'.  */
+
+struct GTY(()) simd_clone_arg {
+  /* Original function argument as it originally existed in
+     DECL_ARGUMENTS.  */
+  tree orig_arg;
+
+  /* If argument is a vector, this holds the vector version of
+     orig_arg that after adjusting the argument types will live in
+     DECL_ARGUMENTS.  Otherwise, this is NULL.
+
+     This basically holds:
+       vector(simdlen) __typeof__(orig_arg) new_arg.  */
+  tree vector_arg;
+
+  /* If argument is a vector, this holds the array where the simd
+     argument is held while executing the simd clone function.  This
+     is a local variable in the cloned function.  Its content is
+     copied from vector_arg upon entry to the clone.
+
+     This basically holds:
+       __typeof__(orig_arg) simd_array[simdlen].  */
+  tree simd_array;
+
+  /* A SIMD clone's argument can be either linear (constant or
+     variable), uniform, or vector.  If the argument is neither linear
+     nor uniform, the default is vector.  */
+
+  /* If the linear stride is a constant, `linear_stride' is
+     LINEAR_STRIDE_YES_CONSTANT, and `linear_stride_num' holds
+     the numeric stride.
+
+     If the linear stride is variable, `linear_stride' is
+     LINEAR_STRIDE_YES_VARIABLE, and `linear_stride_num' contains
+     the function argument containing the stride (as an index into the
+     function arguments starting at 0).
+
+     Otherwise, `linear_stride' is LINEAR_STRIDE_NO and
+     `linear_stride_num' is unused.  */
+  enum linear_stride_type linear_stride;
+  unsigned HOST_WIDE_INT linear_stride_num;
+
+  /* Variable alignment if available, otherwise 0.  */
+  unsigned int alignment;
+
+  /* True if variable is uniform.  */
+  unsigned int uniform : 1;
+};
+
+/* Specific data for a SIMD function clone.  */
+
+struct GTY(()) simd_clone {
+  /* Number of lanes (vector elements) handled by this clone.  */
+  unsigned int simdlen;
+
+  /* Number of annotated function arguments in `args'.  This is
+     usually the number of named arguments in FNDECL.  */
+  unsigned int nargs;
+
+  /* Max hardware vector size in bits.  */
+  unsigned int hw_vector_size;
+
+  /* The mangling character for a given vector size.  This is used
+     to determine the ISA mangling bit as specified in the Intel
+     Vector ABI.  */
+  unsigned char vecsize_mangle;
+
+  /* True if this is the masked, in-branch version of the clone,
+     otherwise false.  */
+  unsigned int inbranch : 1;
+
+  /* True if this is a Cilk Plus variant.  */
+  unsigned int cilk_elemental : 1;
+
+  /* Annotated function arguments for the original function.  */
+  struct simd_clone_arg GTY((length ("%h.nargs"))) args[1];
+};
+
 
 /* The cgraph data structure.
    Each function decl has assigned cgraph_node listing callees and callers.  */
@@ -282,6 +367,14 @@ struct GTY(()) cgraph_node {
   /* Declaration node used to be clone of. */
   tree former_clone_of;
 
+  /* If this is a SIMD clone, this points to the SIMD specific
+     information for it.  */
+  struct simd_clone *simdclone;
+
+  /* If this is a SIMD clone, this points to the original scalar
+     function.  */
+  struct cgraph_node *simdclone_of;
+
   /* Interprocedural passes scheduled to have their transform functions
      applied next time we execute local pass on them.  We maintain it
      per-function in order to allow IPA passes to introduce new functions.  */
@@ -323,6 +416,8 @@ struct GTY(()) cgraph_node {
   /* ?? We should be able to remove this.  We have enough bits in
      cgraph to calculate it.  */
   unsigned tm_clone : 1;
+  /* True if this function has SIMD clones.  */
+  unsigned has_simd_clones : 1;
   /* True if this decl is a dispatcher for function versions.  */
   unsigned dispatcher_function : 1;
 };
@@ -742,6 +837,7 @@ void cgraph_speculative_call_info (struct cgraph_edge *,
 				   struct cgraph_edge *&,
 				   struct cgraph_edge *&,
 				   struct ipa_ref *&);
+struct cgraph_node *get_simd_clone (struct cgraph_node *, tree);
 
 /* In cgraphunit.c  */
 struct asm_node *add_asm_node (tree);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 168a2ac..73140f9 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -42875,6 +42875,42 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
   return val;
 }
 
+/* Return the default mangling character when no vector size can be
+   determined from the `processor' clause.  */
+
+static char
+ix86_cilkplus_default_vecsize_mangle (struct cgraph_node *clone
+				      ATTRIBUTE_UNUSED)
+{
+  return 'x';
+}
+
+/* Return the hardware vector size (in bits) for a mangling
+   character.  */
+
+static unsigned int
+ix86_cilkplus_vecsize_for_mangle (char mangle)
+{
+  /* ?? Intel currently has no ISA encoding character for AVX-512.  */
+  switch (mangle)
+    {
+    case 'x':
+      /* xmm (SSE2).  */
+      return 128;
+    case 'y':
+      /* ymm1 (AVX1).  */
+    case 'Y':
+      /* ymm2 (AVX2).  */
+      return 256;
+    case 'z':
+      /* zmm (MIC).  */
+      return 512;
+    default:
+      gcc_unreachable ();
+      return 0;
+    }
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_RETURN_IN_MEMORY
 #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory
@@ -43247,6 +43283,14 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
 #undef TARGET_SPILL_CLASS
 #define TARGET_SPILL_CLASS ix86_spill_class
 
+#undef TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE
+#define TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE \
+  ix86_cilkplus_default_vecsize_mangle
+
+#undef TARGET_CILKPLUS_VECSIZE_FOR_MANGLE
+#define TARGET_CILKPLUS_VECSIZE_FOR_MANGLE \
+  ix86_cilkplus_vecsize_for_mangle
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 #include "gt-i386.h"
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8d220f3..8bb9d1e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5787,6 +5787,26 @@ The default is @code{NULL_TREE} which means to not vectorize gather
 loads.
 @end deftypefn
 
+@deftypefn {Target Hook} char TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE (struct cgraph_node *@var{})
+This hook should return the default mangling character when no vector
+size can be determined by examining the Cilk Plus @code{processor} clause.
+This is as specified in the Intel Vector ABI document.
+
+This hook, as well as @code{TARGET_CILKPLUS_VECSIZE_FOR_MANGLE} below,
+must be set to support the Cilk Plus @code{processor} clause.
+
+The only argument is a @var{cgraph_node} containing the clone.
+@end deftypefn
+
+@deftypefn {Target Hook} {unsigned int} TARGET_CILKPLUS_VECSIZE_FOR_MANGLE (char)
+This hook returns the maximum hardware vector size in bits for a given
+mangling character.  The character is as described in Intel's
+Vector ABI (see @var{ISA} character in the section on mangling).
+
+This hook must be defined in order to support the Cilk Plus @code{processor}
+clause.
+@end deftypefn
+
 @node Anchored Addresses
 @section Anchored Addresses
 @cindex anchored addresses
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 863e843a..db25787 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4414,6 +4414,10 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_VECTORIZE_BUILTIN_GATHER
 
+@hook TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE
+
+@hook TARGET_CILKPLUS_VECSIZE_FOR_MANGLE
+
 @node Anchored Addresses
 @section Anchored Addresses
 @cindex anchored addresses
diff --git a/gcc/ggc.h b/gcc/ggc.h
index b31bc80..eee90c6 100644
--- a/gcc/ggc.h
+++ b/gcc/ggc.h
@@ -276,4 +276,11 @@ ggc_alloc_cleared_gimple_statement_d_stat (size_t s MEM_STAT_DECL)
     ggc_internal_cleared_alloc_stat (s PASS_MEM_STAT);
 }
 
+static inline struct simd_clone *
+ggc_alloc_cleared_simd_clone_stat (size_t s MEM_STAT_DECL)
+{
+  return (struct simd_clone *)
+    ggc_internal_cleared_alloc_stat (s PASS_MEM_STAT);
+}
+
 #endif
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index c38ba82..faae080 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -446,6 +446,13 @@ determine_versionability (struct cgraph_node *node)
     reason = "not a tree_versionable_function";
   else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
     reason = "insufficient body availability";
+  else if (node->has_simd_clones)
+    {
+      /* Ideally we should clone the SIMD clones themselves and create
+	 vector copies of them, so IPA-cp and SIMD clones can happily
+	 coexist, but that may not be worth the effort.  */
+      reason = "function has SIMD clones";
+    }
 
   if (reason && dump_file && !node->symbol.alias && !node->thunk.thunk_p)
     fprintf (dump_file, "Function %s/%i is not versionable, reason: %s.\n",
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 2fbc9d4..0c20dc6 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -3361,24 +3361,18 @@ void
 ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 			      const char *synth_parm_prefix)
 {
-  vec<tree> oparms, otypes;
-  tree orig_type, new_type = NULL;
-  tree old_arg_types, t, new_arg_types = NULL;
-  tree parm, *link = &DECL_ARGUMENTS (fndecl);
-  int i, len = adjustments.length ();
-  tree new_reversed = NULL;
-  bool care_for_types, last_parm_void;
-
   if (!synth_parm_prefix)
     synth_parm_prefix = "SYNTH";
 
-  oparms = ipa_get_vector_of_formal_parms (fndecl);
-  orig_type = TREE_TYPE (fndecl);
-  old_arg_types = TYPE_ARG_TYPES (orig_type);
+  vec<tree> oparms = ipa_get_vector_of_formal_parms (fndecl);
+  tree orig_type = TREE_TYPE (fndecl);
+  tree old_arg_types = TYPE_ARG_TYPES (orig_type);
 
   /* The following test is an ugly hack, some functions simply don't have any
      arguments in their type.  This is probably a bug but well... */
-  care_for_types = (old_arg_types != NULL_TREE);
+  bool care_for_types = (old_arg_types != NULL_TREE);
+  bool last_parm_void;
+  vec<tree> otypes;
   if (care_for_types)
     {
       last_parm_void = (TREE_VALUE (tree_last (old_arg_types))
@@ -3395,13 +3389,20 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
       otypes.create (0);
     }
 
-  for (i = 0; i < len; i++)
+  int len = adjustments.length ();
+  tree *link = &DECL_ARGUMENTS (fndecl);
+  tree new_arg_types = NULL;
+  for (int i = 0; i < len; i++)
     {
       struct ipa_parm_adjustment *adj;
       gcc_assert (link);
 
       adj = &adjustments[i];
-      parm = oparms[adj->base_index];
+      tree parm;
+      if (adj->new_param)
+	parm = NULL;
+      else
+	parm = oparms[adj->base_index];
       adj->base = parm;
 
       if (adj->copy_param)
@@ -3417,8 +3418,18 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 	  tree new_parm;
 	  tree ptype;
 
-	  if (adj->by_ref)
-	    ptype = build_pointer_type (adj->type);
+	  if (adj->simdlen)
+	    {
+	      /* If we have a non-null simdlen but by_ref is true, we
+		 want a vector of pointers.  Build the vector of
+		 pointers here, not a pointer to a vector in the
+		 adj->by_ref case below.  */
+	      ptype = build_vector_type (adj->type, adj->simdlen);
+	    }
+	  else if (adj->by_ref)
+	    {
+	      ptype = build_pointer_type (adj->type);
+	    }
 	  else
 	    ptype = adj->type;
 
@@ -3427,8 +3438,9 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 
 	  new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL_TREE,
 				 ptype);
-	  DECL_NAME (new_parm) = create_tmp_var_name (synth_parm_prefix);
-
+	  const char *prefix
+	    = adj->new_param ? adj->new_arg_prefix : synth_parm_prefix;
+	  DECL_NAME (new_parm) = create_tmp_var_name (prefix);
 	  DECL_ARTIFICIAL (new_parm) = 1;
 	  DECL_ARG_TYPE (new_parm) = ptype;
 	  DECL_CONTEXT (new_parm) = fndecl;
@@ -3436,17 +3448,20 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 	  DECL_IGNORED_P (new_parm) = 1;
 	  layout_decl (new_parm, 0);
 
-	  adj->base = parm;
+	  if (adj->new_param)
+	    adj->base = new_parm;
+	  else
+	    adj->base = parm;
 	  adj->reduction = new_parm;
 
 	  *link = new_parm;
-
 	  link = &DECL_CHAIN (new_parm);
 	}
     }
 
   *link = NULL_TREE;
 
+  tree new_reversed = NULL;
   if (care_for_types)
     {
       new_reversed = nreverse (new_arg_types);
@@ -3464,6 +3479,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
      Exception is METHOD_TYPEs must have THIS argument.
      When we are asked to remove it, we need to build new FUNCTION_TYPE
      instead.  */
+  tree new_type = NULL;
   if (TREE_CODE (orig_type) != METHOD_TYPE
        || (adjustments[0].copy_param
 	  && adjustments[0].base_index == 0))
@@ -3489,7 +3505,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 
   /* This is a new type, not a copy of an old type.  Need to reassociate
      variants.  We can handle everything except the main variant lazily.  */
-  t = TYPE_MAIN_VARIANT (orig_type);
+  tree t = TYPE_MAIN_VARIANT (orig_type);
   if (orig_type != t)
     {
       TYPE_MAIN_VARIANT (new_type) = t;
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 48634d2..8d7d9b9 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -634,9 +634,10 @@ struct ipa_parm_adjustment
      arguments.  */
   tree alias_ptr_type;
 
-  /* The new declaration when creating/replacing a parameter.  Created by
-     ipa_modify_formal_parameters, useful for functions modifying the body
-     accordingly. */
+  /* The new declaration when creating/replacing a parameter.  Created
+     by ipa_modify_formal_parameters, useful for functions modifying
+     the body accordingly.  For brand new arguments, this is the newly
+     created argument.  */
   tree reduction;
 
   /* New declaration of a substitute variable that we may use to replace all
@@ -647,15 +648,36 @@ struct ipa_parm_adjustment
      is NULL), this is going to be its nonlocalized vars value.  */
   tree nonlocal_value;
 
+  /* If this is a brand new argument, this holds the prefix to be used
+     for the DECL_NAME.  */
+  const char *new_arg_prefix;
+
   /* Offset into the original parameter (for the cases when the new parameter
      is a component of an original one).  */
   HOST_WIDE_INT offset;
 
-  /* Zero based index of the original parameter this one is based on.  (ATM
-     there is no way to insert a new parameter out of the blue because there is
-     no need but if it arises the code can be easily exteded to do so.)  */
+  /* Zero based index of the original parameter this one is based on.  */
   int base_index;
 
+  /* If nonzero, the parameter is a vector of `type' with this many
+     elements.  */
+  int simdlen;
+
+  /* This is a brand new parameter.
+
+     For new parameters, base_index must be >= the number of
+     DECL_ARGUMENTS in the function.  That is, new arguments will be
+     the last arguments in the adjusted function.
+
+     ?? Perhaps we could redesign ipa_modify_formal_parameters() to
+     reorganize argument position, thus allowing inserting of brand
+     new arguments anywhere, but there is no use for this now.
+
+     Also, `type' should be set to the new type, `new_arg_prefix'
+     should be set to the string prefix for the new DECL_NAME, and
+     `reduction' will ultimately hold the newly created argument.  */
+  unsigned new_param : 1;
+
   /* This new parameter is an unmodified parameter at index base_index. */
   unsigned copy_param : 1;
 
@@ -697,5 +719,7 @@ void ipa_dump_param (FILE *, struct ipa_node_params *info, int i);
 /* From tree-sra.c:  */
 tree build_ref_for_offset (location_t, tree, HOST_WIDE_INT, tree,
 			   gimple_stmt_iterator *, bool);
+bool ipa_sra_modify_function_body (ipa_parm_adjustment_vec);
+bool sra_ipa_modify_expr (tree *, bool, ipa_parm_adjustment_vec);
 
 #endif /* IPA_PROP_H */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 26f0c35..afca595 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "cfgloop.h"
 #include "target.h"
+#include "pretty-print.h"
+#include "ipa-prop.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -10380,5 +10382,885 @@ make_pass_diagnose_omp_blocks (gcc::context *ctxt)
 {
   return new pass_diagnose_omp_blocks (ctxt);
 }
+\f
+/* SIMD clone supporting code.  */
+
+/* A map for function arguments.  This maps a zero-based argument
+   number to the corresponding PARM_DECL in DECL_ARGUMENTS.  */
+class argno_map
+{
+  vec<tree> tree_args;
+ public:
+  /* Default constructor declared but not implemented by design.  The
+     only valid constructor is the TREE version below.  */
+  argno_map ();
+  argno_map (tree fndecl);
+
+  ~argno_map () { tree_args.release (); }
+  unsigned int length () { return tree_args.length (); }
+  tree operator[] (unsigned n) { return tree_args[n]; }
+};
+
+/* FNDECL is the function containing the arguments.  */
+
+argno_map::argno_map (tree fndecl)
+{
+  tree_args.create (5);
+  for (tree t = DECL_ARGUMENTS (fndecl); t; t = DECL_CHAIN (t))
+    tree_args.safe_push (t);
+}
+
+/* Allocate a fresh `simd_clone' and return it.  NARGS is the number
+   of arguments to reserve space for.  */
+
+static struct simd_clone *
+simd_clone_struct_alloc (int nargs)
+{
+  struct simd_clone *clone_info;
+  int len = sizeof (struct simd_clone)
+    + nargs * sizeof (struct simd_clone_arg);
+  clone_info = ggc_alloc_cleared_simd_clone_stat (len MEM_STAT_INFO);
+  return clone_info;
+}
+
+/* Make a copy of the `struct simd_clone' in FROM to TO.  */
+
+static inline void
+simd_clone_struct_copy (struct simd_clone *to, struct simd_clone *from)
+{
+  memcpy (to, from, sizeof (struct simd_clone)
+	  + from->nargs * sizeof (struct simd_clone_arg));
+}
+
+/* Given a simd clone in NEW_NODE, extract the simd specific
+   information from the OMP clauses passed in CLAUSES, and set the
+   relevant bits in the cgraph node.  *INBRANCH_SPECIFIED is set to
+   TRUE if the `inbranch' or `notinbranch' clause is specified, otherwise
+   set to FALSE.  */
+
+static void
+simd_clone_clauses_extract (struct cgraph_node *new_node, tree clauses,
+			    bool *inbranch_specified)
+{
+  tree t;
+  int n = 0;
+  *inbranch_specified = false;
+  for (t = DECL_ARGUMENTS (new_node->symbol.decl); t; t = DECL_CHAIN (t))
+    ++n;
+
+  /* To distinguish from an OpenMP simd clone, Cilk Plus functions to
+     be cloned have a distinctive artificial label in addition to "omp
+     declare simd".  */
+  bool cilk_clone
+    = (flag_enable_cilkplus
+       && lookup_attribute ("cilk plus elemental",
+			    DECL_ATTRIBUTES (new_node->symbol.decl)));
+
+  /* Allocate one more than needed just in case this is an in-branch
+     clone which will require a mask argument.  */
+  struct simd_clone *clone_info = simd_clone_struct_alloc (n + 1);
+  clone_info->nargs = n;
+  clone_info->cilk_elemental = cilk_clone;
+  gcc_assert (!new_node->simdclone);
+  new_node->simdclone = clone_info;
+
+  if (!clauses)
+    return;
+  clauses = TREE_VALUE (clauses);
+  if (!clauses || TREE_CODE (clauses) != OMP_CLAUSE)
+    return;
+
+  for (t = clauses; t; t = OMP_CLAUSE_CHAIN (t))
+    {
+      switch (OMP_CLAUSE_CODE (t))
+	{
+	case OMP_CLAUSE_INBRANCH:
+	  clone_info->inbranch = 1;
+	  *inbranch_specified = true;
+	  break;
+	case OMP_CLAUSE_NOTINBRANCH:
+	  clone_info->inbranch = 0;
+	  *inbranch_specified = true;
+	  break;
+	case OMP_CLAUSE_SIMDLEN:
+	  clone_info->simdlen
+	    = TREE_INT_CST_LOW (OMP_CLAUSE_SIMDLEN_EXPR (t));
+	  break;
+	case OMP_CLAUSE_LINEAR:
+	  {
+	    tree decl = OMP_CLAUSE_DECL (t);
+	    tree step = OMP_CLAUSE_LINEAR_STEP (t);
+	    int argno = TREE_INT_CST_LOW (decl);
+	    if (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE (t))
+	      {
+		clone_info->args[argno].linear_stride
+		  = LINEAR_STRIDE_YES_VARIABLE;
+		clone_info->args[argno].linear_stride_num
+		  = TREE_INT_CST_LOW (step);
+		gcc_assert (!TREE_INT_CST_HIGH (step));
+	      }
+	    else
+	      {
+		if (TREE_INT_CST_HIGH (step))
+		  {
+		    /* It looks like this can't really happen, since the
+		       front-ends generally issue:
+
+		       warning: integer constant is too large for its type.
+
+		       But let's assume somehow we got past all that.  */
+		    warning_at (DECL_SOURCE_LOCATION (decl), 0,
+				"ignoring large linear step");
+		  }
+		else
+		  {
+		    clone_info->args[argno].linear_stride
+		      = LINEAR_STRIDE_YES_CONSTANT;
+		    clone_info->args[argno].linear_stride_num
+		      = TREE_INT_CST_LOW (step);
+		  }
+	      }
+	    break;
+	  }
+	case OMP_CLAUSE_UNIFORM:
+	  {
+	    tree decl = OMP_CLAUSE_DECL (t);
+	    int argno = tree_low_cst (decl, 1);
+	    clone_info->args[argno].uniform = 1;
+	    break;
+	  }
+	case OMP_CLAUSE_ALIGNED:
+	  {
+	    tree decl = OMP_CLAUSE_DECL (t);
+	    int argno = tree_low_cst (decl, 1);
+	    clone_info->args[argno].alignment
+	      = TREE_INT_CST_LOW (OMP_CLAUSE_ALIGNED_ALIGNMENT (t));
+	    break;
+	  }
+	default:
+	  break;
+	}
+    }
+}
+
+/* Helper function for mangling vectors.  Given a vector size in bits,
+   return the corresponding mangling character.  */
+
+static char
+vecsize_mangle (unsigned int vecsize)
+{
+  switch (vecsize)
+    {
+      /* The Intel Vector ABI does not provide a mangling character
+	 for a 64-bit ISA, but this feels like it's keeping with the
+	 design.  */
+    case 64: return 'w';
+
+    case 128: return 'x';
+    case 256: return 'y';
+    case 512: return 'z';
+    default:
+      /* FIXME: We must come up with a default mangling bit.  */
+      return 'x';
+    }
+}
+
+/* Given a SIMD clone in NEW_NODE, calculate the characteristic data
+   type and return the corresponding type.  The characteristic data
+   type is computed as described in the Intel Vector ABI.  */
+
+static tree
+simd_clone_compute_base_data_type (struct cgraph_node *new_node)
+{
+  tree type = integer_type_node;
+  tree fndecl = new_node->symbol.decl;
+
+  /* a) For non-void function, the characteristic data type is the
+        return type.  */
+  if (TREE_CODE (TREE_TYPE (TREE_TYPE (fndecl))) != VOID_TYPE)
+    type = TREE_TYPE (TREE_TYPE (fndecl));
+
+  /* b) If the function has any non-uniform, non-linear parameters,
+        then the characteristic data type is the type of the first
+        such parameter.  */
+  else
+    {
+      argno_map map (fndecl);
+      for (unsigned int i = 0; i < new_node->simdclone->nargs; ++i)
+	{
+	  struct simd_clone_arg arg = new_node->simdclone->args[i];
+	  if (!arg.uniform && arg.linear_stride == LINEAR_STRIDE_NO)
+	    {
+	      type = TREE_TYPE (map[i]);
+	      break;
+	    }
+	}
+    }
+
+  /* c) If the characteristic data type determined by a) or b) above
+        is struct, union, or class type which is pass-by-value (except
+        for the type that maps to the built-in complex data type), the
+        characteristic data type is int.  */
+  if (RECORD_OR_UNION_TYPE_P (type)
+      && !aggregate_value_p (type, NULL)
+      && TREE_CODE (type) != COMPLEX_TYPE)
+    return integer_type_node;
+
+  /* d) If none of the above three classes is applicable, the
+        characteristic data type is int.  */
+
+  return type;
+
+  /* e) For Intel Xeon Phi native and offload compilation, if the
+        resulting characteristic data type is 8-bit or 16-bit integer
+        data type, the characteristic data type is int.  */
+  /* Well, we don't handle Xeon Phi yet.  */
+}
+
+/* Given a SIMD clone in NEW_NODE, compute simdlen and vector size,
+   and store them in NEW_NODE->simdclone.  */
+
+static void
+simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *new_node)
+{
+  char vmangle = new_node->simdclone->vecsize_mangle;
+  /* Vector size for this clone.  */
+  unsigned int vecsize = 0;
+  /* Base vector type, based on function arguments.  */
+  tree base_type = simd_clone_compute_base_data_type (new_node);
+  unsigned int base_type_size = GET_MODE_BITSIZE (TYPE_MODE (base_type));
+
+  /* Calculate everything for Cilk Plus clones with appropriate target
+     support.  This is as specified in the Intel Vector ABI.
+
+     Note: Any target which supports the Cilk Plus processor clause
+     must also provide appropriate target hooks for calculating
+     default ISA/processor (default_vecsize_mangle), and for
+     calculating hardware vector size based on ISA/processor
+     (vecsize_for_mangle).  */
+  if (new_node->simdclone->cilk_elemental
+      && targetm.cilkplus.default_vecsize_mangle)
+    {
+      if (!vmangle)
+	vmangle = targetm.cilkplus.default_vecsize_mangle (new_node);
+      vecsize = targetm.cilkplus.vecsize_for_mangle (vmangle);
+      if (!new_node->simdclone->simdlen)
+	new_node->simdclone->simdlen = vecsize / base_type_size;
+    }
+  /* Calculate everything else generically.  */
+  else
+    {
+      vecsize = GET_MODE_BITSIZE (targetm.vectorize.preferred_simd_mode
+				  (TYPE_MODE (base_type)));
+      vmangle = vecsize_mangle (vecsize);
+      if (!new_node->simdclone->simdlen)
+	new_node->simdclone->simdlen = vecsize / base_type_size;
+    }
+  new_node->simdclone->vecsize_mangle = vmangle;
+  new_node->simdclone->hw_vector_size = vecsize;
+}
+
+static void
+simd_clone_mangle (struct cgraph_node *old_node, struct cgraph_node *new_node)
+{
+  char vecsize_mangle = new_node->simdclone->vecsize_mangle;
+  char mask = new_node->simdclone->inbranch ? 'M' : 'N';
+  unsigned int simdlen = new_node->simdclone->simdlen;
+  unsigned int n;
+  pretty_printer pp;
+
+  gcc_assert (vecsize_mangle && simdlen);
+
+  pp_string (&pp, "_ZGV");
+  pp_character (&pp, vecsize_mangle);
+  pp_character (&pp, mask);
+  pp_decimal_int (&pp, simdlen);
+
+  for (n = 0; n < new_node->simdclone->nargs; ++n)
+    {
+      struct simd_clone_arg arg = new_node->simdclone->args[n];
+
+      if (arg.uniform)
+	pp_character (&pp, 'u');
+      else if (arg.linear_stride == LINEAR_STRIDE_YES_CONSTANT)
+	{
+	  gcc_assert (arg.linear_stride_num != 0);
+	  pp_character (&pp, 'l');
+	  if (arg.linear_stride_num > 1)
+	    pp_unsigned_wide_integer (&pp,
+				      arg.linear_stride_num);
+	}
+      else if (arg.linear_stride == LINEAR_STRIDE_YES_VARIABLE)
+	{
+	  pp_character (&pp, 's');
+	  pp_unsigned_wide_integer (&pp, arg.linear_stride_num);
+	}
+      else
+	pp_character (&pp, 'v');
+      if (arg.alignment)
+	{
+	  pp_character (&pp, 'a');
+	  pp_decimal_int (&pp, arg.alignment);
+	}
+    }
+
+  pp_underscore (&pp);
+  pp_string (&pp,
+	     IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (old_node->symbol.decl)));
+  const char *str = pp_formatted_text (&pp);
+  change_decl_assembler_name (new_node->symbol.decl,
+			      get_identifier (str));
+}
+
+/* Create a simd clone of OLD_NODE and return it.  */
+
+static struct cgraph_node *
+simd_clone_create (struct cgraph_node *old_node)
+{
+  struct cgraph_node *new_node;
+  new_node = cgraph_function_versioning (old_node, vNULL, NULL, NULL, false,
+					 NULL, NULL, "simdclone");
+
+  new_node->simdclone_of = old_node;
+
+  /* Keep cgraph friends from removing the clone.  */
+  new_node->symbol.externally_visible
+    = old_node->symbol.externally_visible;
+  TREE_PUBLIC (new_node->symbol.decl) = TREE_PUBLIC (old_node->symbol.decl);
+  old_node->has_simd_clones = true;
+
+  /* The function cgraph_function_versioning() will force the new
+     symbol local.  Undo this, and inherit external visibility from
+     the old node.  */
+  new_node->local.local = old_node->local.local;
+  new_node->symbol.externally_visible = old_node->symbol.externally_visible;
+
+  return new_node;
+}
+
+/* Adjust the return type of the given function to its appropriate
+   vector counterpart.  Returns a simd array to be used throughout the
+   function as a return value.  */
+
+static tree
+simd_clone_adjust_return_type (struct cgraph_node *node)
+{
+  tree fndecl = node->symbol.decl;
+  tree orig_rettype = TREE_TYPE (TREE_TYPE (fndecl));
+
+  tree t = DECL_RESULT (fndecl);
+  /* Adjust the DECL_RESULT.  */
+  if (TREE_TYPE (t) != void_type_node)
+    {
+      TREE_TYPE (t)
+	= build_vector_type (TREE_TYPE (t), node->simdclone->simdlen);
+      DECL_MODE (t) = TYPE_MODE (TREE_TYPE (t));
+    }
+  /* Adjust the function return type.  */
+  if (TREE_TYPE (TREE_TYPE (fndecl)) != void_type_node)
+    {
+      TREE_TYPE (fndecl)
+	= copy_node (TREE_TYPE (fndecl));
+      TREE_TYPE (TREE_TYPE (fndecl))
+	= copy_node (TREE_TYPE (TREE_TYPE (fndecl)));
+      TREE_TYPE (TREE_TYPE (fndecl))
+	= build_vector_type (TREE_TYPE (TREE_TYPE (fndecl)),
+			     node->simdclone->simdlen);
+    }
+
+  /* Set up a SIMD array to use as the return value.  */
+  tree retval;
+  if (orig_rettype != void_type_node)
+    {
+      retval
+	= create_tmp_var_raw (build_array_type_nelts (orig_rettype,
+						      node->simdclone->simdlen),
+			      "retval");
+      gimple_add_tmp_var (retval);
+    }
+  else
+    retval = NULL;
+  return retval;
+}
+
+/* Each vector argument has a corresponding array to be used locally
+   as part of the eventual loop.  Create such a temporary array and
+   return it.
+
+   PREFIX is the prefix to be used for the temporary.
+
+   TYPE is the inner element type.
+
+   SIMDLEN is the number of elements.  */
+
+static tree
+create_tmp_simd_array (const char *prefix, tree type, int simdlen)
+{
+  tree atype = build_array_type_nelts (type, simdlen);
+  tree avar = create_tmp_var_raw (atype, prefix);
+  gimple_add_tmp_var (avar);
+  return avar;
+}
+
+/* Modify the function argument types to their corresponding vector
+   counterparts if appropriate.  Also, create one array for each simd
+   argument to be used locally when using the function arguments as
+   part of the loop.
+
+   NODE is the function whose arguments are to be adjusted.
+
+   Returns an adjustment vector that will be filled describing how the
+   argument types will be adjusted.  */
+
+static ipa_parm_adjustment_vec
+simd_clone_adjust_argument_types (struct cgraph_node *node)
+{
+  argno_map args (node->symbol.decl);
+  ipa_parm_adjustment_vec adjustments;
+
+  adjustments.create (args.length ());
+  unsigned i;
+  for (i = 0; i < node->simdclone->nargs; ++i)
+    {
+      struct ipa_parm_adjustment adj;
+
+      memset (&adj, 0, sizeof (adj));
+      tree parm = args[i];
+      adj.base_index = i;
+      adj.base = parm;
+
+      node->simdclone->args[i].orig_arg = parm;
+
+      if (node->simdclone->args[i].uniform
+	  || node->simdclone->args[i].linear_stride != LINEAR_STRIDE_NO)
+	{
+	  /* No adjustment necessary for scalar arguments.  */
+	  adj.copy_param = 1;
+	}
+      else
+	{
+	  adj.simdlen = node->simdclone->simdlen;
+	  if (POINTER_TYPE_P (TREE_TYPE (parm)))
+	    adj.by_ref = 1;
+	  adj.type = TREE_TYPE (parm);
+
+	  node->simdclone->args[i].simd_array
+	    = create_tmp_simd_array (IDENTIFIER_POINTER (DECL_NAME (parm)),
+				     TREE_TYPE (parm),
+				     node->simdclone->simdlen);
+	}
+      adjustments.quick_push (adj);
+    }
+
+  if (node->simdclone->inbranch)
+    {
+      struct ipa_parm_adjustment adj;
+
+      memset (&adj, 0, sizeof (adj));
+      adj.new_param = 1;
+      adj.new_arg_prefix = "mask";
+      adj.base_index = i;
+      adj.type
+	= build_vector_type (integer_type_node, node->simdclone->simdlen);
+      adjustments.safe_push (adj);
+
+      /* We have previously allocated one extra entry for the mask.  Use
+	 it and fill it.  */
+      struct simd_clone *sc = node->simdclone;
+      sc->nargs++;
+      sc->args[i].orig_arg = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL,
+					 integer_type_node);
+      sc->args[i].simd_array
+	= create_tmp_simd_array ("mask", integer_type_node, sc->simdlen);
+    }
+
+  ipa_modify_formal_parameters (node->symbol.decl, adjustments, "simd");
+  return adjustments;
+}
+
+/* Initialize and copy the function arguments in NODE to their
+   corresponding local simd arrays.  Returns a fresh gimple_seq with
+   the instruction sequence generated.  */
+
+static gimple_seq
+simd_clone_init_simd_arrays (struct cgraph_node *node,
+			     ipa_parm_adjustment_vec adjustments)
+{
+  gimple_seq seq = NULL;
+  unsigned i = 0;
+
+  for (tree arg = DECL_ARGUMENTS (node->symbol.decl);
+       arg;
+       arg = DECL_CHAIN (arg), i++)
+    {
+      if (adjustments[i].copy_param)
+	continue;
+
+      node->simdclone->args[i].vector_arg = arg;
+
+      tree array = node->simdclone->args[i].simd_array;
+      tree t = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (array), arg);
+      t = build2 (MODIFY_EXPR, TREE_TYPE (array), array, t);
+      gimplify_and_add (t, &seq);
+    }
+  return seq;
+}
+
+/* Traverse the function body and perform all modifications as
+   described in ADJUSTMENTS.  At function return, ADJUSTMENTS will be
+   modified such that the replacement/reduction value will now be an
+   offset into the corresponding simd_array.
+
+   This function will replace all function argument uses with their
+   corresponding simd array elements, and adjust the return values
+   accordingly.  */
+
+static void
+ipa_simd_modify_function_body (struct cgraph_node *node,
+			       ipa_parm_adjustment_vec adjustments,
+			       tree retval_array, tree iter)
+{
+  basic_block bb;
+
+  /* Re-use the adjustments array, but this time use it to replace
+     every function argument use with an offset into the corresponding
+     simd_array.  */
+  for (unsigned i = 0; i < node->simdclone->nargs; ++i)
+    {
+      if (!node->simdclone->args[i].vector_arg)
+	continue;
+
+      tree basetype = TREE_TYPE (node->simdclone->args[i].orig_arg);
+      adjustments[i].reduction
+	= build4 (ARRAY_REF,
+		  basetype,
+		  node->simdclone->args[i].simd_array,
+		  iter,
+		  NULL_TREE, NULL_TREE);
+    }
+
+  FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (node->symbol.decl))
+    {
+      gimple_stmt_iterator gsi;
+
+      gsi = gsi_start_bb (bb);
+      while (!gsi_end_p (gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  bool modified = false;
+	  tree *t;
+	  unsigned i;
+
+	  switch (gimple_code (stmt))
+	    {
+	    case GIMPLE_RETURN:
+	      {
+		/* Replace `return foo' by `retval_array[iter] = foo'.  */
+		tree old_retval = gimple_return_retval (stmt);
+		if (!old_retval)
+		  break;
+		stmt = gimple_build_assign (build4 (ARRAY_REF,
+						    TREE_TYPE (old_retval),
+						    retval_array, iter,
+						    NULL, NULL),
+					    old_retval);
+		gsi_replace (&gsi, stmt, true);
+		modified = true;
+		break;
+	      }
+
+	    case GIMPLE_ASSIGN:
+	      t = gimple_assign_lhs_ptr (stmt);
+	      modified |= sra_ipa_modify_expr (t, false, adjustments);
+	      for (i = 0; i < gimple_num_ops (stmt); ++i)
+		{
+		  t = gimple_op_ptr (stmt, i);
+		  modified |= sra_ipa_modify_expr (t, false, adjustments);
+		}
+	      break;
+
+	    case GIMPLE_CALL:
+	      /* Operands must be processed before the lhs.  */
+	      for (i = 0; i < gimple_call_num_args (stmt); i++)
+		{
+		  t = gimple_call_arg_ptr (stmt, i);
+		  modified |= sra_ipa_modify_expr (t, true, adjustments);
+		}
+
+	      if (gimple_call_lhs (stmt))
+		{
+		  t = gimple_call_lhs_ptr (stmt);
+		  modified |= sra_ipa_modify_expr (t, false, adjustments);
+		}
+	      break;
+
+	    case GIMPLE_ASM:
+	      for (i = 0; i < gimple_asm_ninputs (stmt); i++)
+		{
+		  t = &TREE_VALUE (gimple_asm_input_op (stmt, i));
+		  modified |= sra_ipa_modify_expr (t, true, adjustments);
+		}
+	      for (i = 0; i < gimple_asm_noutputs (stmt); i++)
+		{
+		  t = &TREE_VALUE (gimple_asm_output_op (stmt, i));
+		  modified |= sra_ipa_modify_expr (t, false, adjustments);
+		}
+	      break;
+
+	    default:
+	      for (i = 0; i < gimple_num_ops (stmt); ++i)
+		{
+		  t = gimple_op_ptr (stmt, i);
+		  if (*t)
+		    modified |= sra_ipa_modify_expr (t, true, adjustments);
+		}
+	      break;
+	    }
+
+	  if (modified)
+	    {
+	      gimple_regimplify_operands (stmt, &gsi);
+	      update_stmt (stmt);
+	      if (maybe_clean_eh_stmt (stmt))
+		gimple_purge_dead_eh_edges (gimple_bb (stmt));
+	    }
+	  gsi_next (&gsi);
+	}
+    }
+}
+
+/* Adjust the argument types in NODE to their appropriate vector
+   counterparts.  */
+
+static void
+simd_clone_adjust (struct cgraph_node *node)
+{
+  // FIXME: -------ABI STUFF--------
+  //   0. Create clones for externs.
+  //   1. Arguments split across multiple args.
+  //   2. Which registers to pass in.
+  //   3. Get mangling correct for x86*
+  //   4. Agree on what default clones to generate when simdlen() missing.
+
+  // FIXME: ------- VECTORIZER CHANGES -------
+  //   1. At least the easy, notinbranch cases.
+  //   2. Handle linear/uniform arguments in get_simd_clone/etc.
+  //   3. Bail on non-SLP vectorizer mode.
+
+  // FIXME:  __attribute__((target (something))) if needed
+
+  // FIXME: get_simd_clone() needs optimization.
+
+  push_cfun (DECL_STRUCT_FUNCTION (node->symbol.decl));
+
+  tree retval = simd_clone_adjust_return_type (node);
+  ipa_parm_adjustment_vec adjustments = simd_clone_adjust_argument_types (node);
+
+  struct gimplify_ctx gctx;
+  push_gimplify_context (&gctx);
+
+  gimple_seq seq = simd_clone_init_simd_arrays (node, adjustments);
+
+  /* Adjust all uses of vector arguments accordingly.  Adjust all
+     return values accordingly.  */
+  tree iter = create_tmp_var (unsigned_type_node, "iter");
+  ipa_simd_modify_function_body (node, adjustments, retval, iter);
+
+  /* Initialize the iteration variable.  */
+  gimple g
+    = gimple_build_assign_with_ops (INTEGER_CST,
+				    iter,
+				    build_int_cst (unsigned_type_node, 0),
+				    NULL_TREE);
+  gimple_seq_add_stmt (&seq, g);
+
+  basic_block entry_bb = single_succ (ENTRY_BLOCK_PTR);
+  basic_block body_bb = split_block_after_labels (entry_bb)->dest;
+  gimple_stmt_iterator gsi = gsi_after_labels (entry_bb);
+  /* Insert the SIMD array and iv initialization at function
+     entry.  */
+  gsi_insert_seq_before (&gsi, seq, GSI_NEW_STMT);
+
+  pop_gimplify_context (NULL);
+
+  /* Create a new BB right before the original exit BB, to hold the
+     iteration increment and the condition/branch.  */
+  basic_block orig_exit = EDGE_PRED (EXIT_BLOCK_PTR, 0)->src;
+  basic_block incr_bb = create_empty_bb (orig_exit);
+  /* The succ of orig_exit was EXIT_BLOCK_PTR, with an empty flag.
+     Set it now to be a FALLTHRU_EDGE.  */
+  gcc_assert (EDGE_COUNT (orig_exit->succs) == 1);
+  EDGE_SUCC (orig_exit, 0)->flags |= EDGE_FALLTHRU;
+  for (unsigned i = 0; i < EDGE_COUNT (EXIT_BLOCK_PTR->preds); ++i)
+    {
+      edge e = EDGE_PRED (EXIT_BLOCK_PTR, i);
+      redirect_edge_succ (e, incr_bb);
+    }
+  edge e = make_edge (incr_bb, EXIT_BLOCK_PTR, 0);
+  e->probability = REG_BR_PROB_BASE;
+  gsi = gsi_last_bb (incr_bb);
+  g = gimple_build_assign_with_ops (PLUS_EXPR, iter, iter,
+				    build_int_cst (unsigned_type_node, 1));
+  gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
+
+  /* Mostly annotate the loop for the vectorizer (the rest is done below).  */
+  struct loop *loop = alloc_loop ();
+  cfun->has_force_vect_loops = true;
+  loop->safelen = node->simdclone->simdlen;
+  loop->force_vect = true;
+  loop->header = body_bb;
+  add_bb_to_loop (incr_bb, loop);
+
+  /* Branch around the body if the mask applies.  */
+  if (node->simdclone->inbranch)
+    {
+      gimple_stmt_iterator gsi = gsi_last_bb (loop->header);
+      tree mask_array
+	= node->simdclone->args[node->simdclone->nargs - 1].simd_array;
+      tree mask = create_tmp_var (integer_type_node, NULL);
+      tree aref = build4 (ARRAY_REF,
+			  integer_type_node,
+			  mask_array, iter,
+			  NULL, NULL);
+      g = gimple_build_assign (mask, aref);
+      gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
+
+      g = gimple_build_cond (EQ_EXPR, mask, integer_zero_node,
+			     NULL, NULL);
+      gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
+      make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
+      FALLTHRU_EDGE (loop->header)->flags = EDGE_FALSE_VALUE;
+    }
+
+  /* Generate the condition.  */
+  g = gimple_build_cond (LT_EXPR,
+			 iter,
+			 build_int_cst (unsigned_type_node,
+					node->simdclone->simdlen),
+			 NULL, NULL);
+  gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
+  e = split_block (incr_bb, gsi_stmt (gsi));
+  basic_block latch_bb = e->dest;
+  basic_block new_exit_bb = e->dest;
+  new_exit_bb = split_block (latch_bb, NULL)->dest;
+  loop->latch = latch_bb;
+
+  redirect_edge_succ (FALLTHRU_EDGE (latch_bb), body_bb);
+
+  make_edge (incr_bb, new_exit_bb, EDGE_FALSE_VALUE);
+  /* The successor of incr_bb is already pointing to latch_bb; just
+     change the flags.
+     make_edge (incr_bb, latch_bb, EDGE_TRUE_VALUE);  */
+  FALLTHRU_EDGE (incr_bb)->flags = EDGE_TRUE_VALUE;
+
+  /* Generate the new return.  */
+  gsi = gsi_last_bb (new_exit_bb);
+  if (retval)
+    {
+      retval = build1 (VIEW_CONVERT_EXPR,
+		       TREE_TYPE (TREE_TYPE (node->symbol.decl)),
+		       retval);
+      retval = force_gimple_operand_gsi (&gsi, retval, true, NULL,
+					 false, GSI_CONTINUE_LINKING);
+    }
+  g = gimple_build_return (retval);
+  gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  add_loop (loop, loop->header->loop_father);
+
+  pop_cfun ();
+}
+
+/* If the function in NODE is tagged as an elemental SIMD function,
+   create the appropriate SIMD clones.  */
+
+static void
+expand_simd_clones (struct cgraph_node *node)
+{
+  if (cgraph_function_body_availability (node) < AVAIL_OVERWRITABLE)
+    return;
+
+  tree attr = lookup_attribute ("omp declare simd",
+				DECL_ATTRIBUTES (node->symbol.decl));
+  if (!attr)
+    return;
+  do
+    {
+      struct cgraph_node *new_node = simd_clone_create (node);
+
+      bool inbranch_clause_specified;
+      simd_clone_clauses_extract (new_node, TREE_VALUE (attr),
+				  &inbranch_clause_specified);
+      simd_clone_compute_vecsize_and_simdlen (new_node);
+      simd_clone_mangle (node, new_node);
+      simd_clone_adjust (new_node);
+
+      /* If no inbranch clause was specified, we need both variants.
+	 We have already created the not-in-branch version above, by
+	 virtue of .inbranch being clear.  Create the masked in-branch
+	 version.  */
+      if (!inbranch_clause_specified)
+	{
+	  struct cgraph_node *n = simd_clone_create (node);
+	  struct simd_clone *clone
+	    = simd_clone_struct_alloc (new_node->simdclone->nargs);
+	  simd_clone_struct_copy (clone, new_node->simdclone);
+	  clone->inbranch = 1;
+	  n->simdclone = clone;
+	  simd_clone_mangle (node, n);
+	  simd_clone_adjust (n);
+	}
+    }
+  while ((attr = lookup_attribute ("omp declare simd", TREE_CHAIN (attr))));
+}
+
+/* Entry point for IPA simd clone creation pass.  */
+
+static unsigned int
+ipa_omp_simd_clone (void)
+{
+  struct cgraph_node *node;
+  FOR_EACH_DEFINED_FUNCTION (node)
+    expand_simd_clones (node);
+  return 0;
+}
+
+namespace {
+
+const pass_data pass_data_omp_simd_clone =
+{
+  SIMPLE_IPA_PASS,		/* type */
+  "simdclone",			/* name */
+  OPTGROUP_NONE,		/* optinfo_flags */
+  true,				/* has_gate */
+  true,				/* has_execute */
+  TV_NONE,			/* tv_id */
+  ( PROP_ssa | PROP_cfg ),	/* properties_required */
+  0,				/* properties_provided */
+  0,				/* properties_destroyed */
+  0,				/* todo_flags_start */
+  (TODO_update_ssa | TODO_verify_all | TODO_cleanup_cfg), /* todo_flags_finish */
+};
+
+class pass_omp_simd_clone : public simple_ipa_opt_pass
+{
+public:
+  pass_omp_simd_clone(gcc::context *ctxt)
+    : simple_ipa_opt_pass(pass_data_omp_simd_clone, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  bool gate () { return flag_openmp || flag_enable_cilkplus; }
+  unsigned int execute () { return ipa_omp_simd_clone (); }
+};
+
+} // anon namespace
+
+simple_ipa_opt_pass *
+make_pass_omp_simd_clone (gcc::context *ctxt)
+{
+  return new pass_omp_simd_clone (ctxt);
+}
 
 #include "gt-omp-low.h"
diff --git a/gcc/passes.def b/gcc/passes.def
index 84eb3f3..6803399 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -97,6 +97,7 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_feedback_split_functions);
   POP_INSERT_PASSES ()
   NEXT_PASS (pass_ipa_increase_alignment);
+  NEXT_PASS (pass_omp_simd_clone);
   NEXT_PASS (pass_ipa_tm);
   NEXT_PASS (pass_ipa_lower_emutls);
   TERMINATE_PASS_LIST ()
diff --git a/gcc/target.def b/gcc/target.def
index 6de513f..92cbd73 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1508,6 +1508,35 @@ hook_int_uint_mode_1)
 
 HOOK_VECTOR_END (sched)
 
+/* Functions relating to Cilk Plus.  */
+#undef HOOK_PREFIX
+#define HOOK_PREFIX "TARGET_CILKPLUS_"
+HOOK_VECTOR (TARGET_CILKPLUS, cilkplus)
+
+DEFHOOK
+(default_vecsize_mangle,
+"This hook should return the default mangling character when no vector\n\
+size can be determined by examining the Cilk Plus @code{processor} clause.\n\
+This is as specified in the Intel Vector ABI document.\n\
+\n\
+This hook, as well as @code{vecsize_for_mangle} below, must be set\n\
+to support the Cilk Plus @code{processor} clause.\n\
+\n\
+The only argument is a @var{cgraph_node} containing the clone.",
+char, (struct cgraph_node *), NULL)
+
+DEFHOOK
+(vecsize_for_mangle,
+"This hook returns the maximum hardware vector size in bits for a given\n\
+mangling character.  The character is as described in Intel's\n\
+Vector ABI (see @var{ISA} character in the section on mangling).\n\
+\n\
+This hook must be defined in order to support the Cilk Plus @code{processor}\n\
+clause.",
+unsigned int, (char), NULL)
+
+HOOK_VECTOR_END (cilkplus)
+
 /* Functions relating to vectorization.  */
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_VECTORIZE_"
diff --git a/gcc/testsuite/gcc.dg/gomp/simd-clones-1.c b/gcc/testsuite/gcc.dg/gomp/simd-clones-1.c
new file mode 100644
index 0000000..486b67a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/simd-clones-1.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -fdump-tree-optimized -O3" } */
+
+/* Test that functions that have SIMD clone counterparts are not
+   cloned by IPA-CP.  For example, special_add() below has SIMD clones
+   created for it.  However, if IPA-CP later decides to clone a
+   specialization of special_add(x, 666) when analyzing fillit(), we
+   will forever keep the vectorizer from using the SIMD versions of
+   special_add in a loop.
+
+   If IPA-CP gets taught how to adjust the SIMD clones as well, this
+   test could be removed.  */
+
+#pragma omp declare simd simdlen(4)
+static int  __attribute__ ((noinline))
+special_add (int x, int y)
+{
+  if (y == 666)
+    return x + y + 123;
+  else
+    return x + y;
+}
+
+void fillit(int *tot)
+{
+  int i;
+
+  for (i=0; i < 10000; ++i)
+    tot[i] = special_add (i, 666);
+}
+
+/* { dg-final { scan-tree-dump-not "special_add.constprop" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/gomp/simd-clones-2.c b/gcc/testsuite/gcc.dg/gomp/simd-clones-2.c
new file mode 100644
index 0000000..8ab3131
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/simd-clones-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-fopenmp -fdump-tree-optimized -O -msse2" } */
+
+#pragma omp declare simd inbranch uniform(c) linear(b:66)   // addit.simdclone.2
+#pragma omp declare simd notinbranch aligned(c:32) // addit.simdclone.1
+int addit(int a, int b, int c)
+{
+  return a + b;
+}
+
+#pragma omp declare simd uniform(a) aligned(a:32) linear(k:1) notinbranch
+float setArray(float *a, float x, int k)
+{
+  a[k] = a[k] + x;
+  return a[k];
+}
+
+/* { dg-final { scan-tree-dump "clone.0 \\(_ZGVxN4ua32vl_setArray" "optimized" } } */
+/* { dg-final { scan-tree-dump "clone.1 \\(_ZGVxN4vvva32_addit" "optimized" } } */
+/* { dg-final { scan-tree-dump "clone.2 \\(_ZGVxM4vl66u_addit" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/gomp/simd-clones-3.c b/gcc/testsuite/gcc.dg/gomp/simd-clones-3.c
new file mode 100644
index 0000000..a7fc2a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/simd-clones-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-fopenmp -fdump-tree-optimized -O2 -msse2" } */
+
+/* Test that if there are no *inbranch clauses, both the masked and
+   the unmasked versions are created.  */
+
+#pragma omp declare simd
+int addit(int a, int b, int c)
+{
+  return a + b;
+}
+
+/* { dg-final { scan-tree-dump "clone.* \\(_ZGVxN4vvv_addit" "optimized" } } */
+/* { dg-final { scan-tree-dump "clone.* \\(_ZGVxM4vvv_addit" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/gomp/simd-clones-4.c b/gcc/testsuite/gcc.dg/gomp/simd-clones-4.c
new file mode 100644
index 0000000..893f44e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/simd-clones-4.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-fopenmp" } */
+
+#pragma omp declare simd simdlen(4) notinbranch
+int f2 (int a, int b)
+{
+  if (a > 5)
+    return a + b;
+  else
+    return a - b;
+}
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index a14c7e0..c6b0c72 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -886,6 +886,9 @@ struct GTY(()) tree_base {
        CALL_ALLOCA_FOR_VAR_P in
            CALL_EXPR
 
+       OMP_CLAUSE_LINEAR_VARIABLE_STRIDE in
+	   OMP_CLAUSE_LINEAR
+
    side_effects_flag:
 
        TREE_SIDE_EFFECTS in
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index e72fe9a..41e8794 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -473,6 +473,7 @@ extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_lto_finish_out (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_omp_simd_clone (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_profile (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_cdtor_merge (gcc::context *ctxt);
 
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 82520ba..8d61c35 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -4486,7 +4486,7 @@ replace_removed_params_ssa_names (gimple stmt,
    incompatibility issues to the caller.  Return true iff the expression
    was modified. */
 
-static bool
+bool
 sra_ipa_modify_expr (tree *expr, bool convert,
 		     ipa_parm_adjustment_vec adjustments)
 {
@@ -4624,7 +4624,7 @@ sra_ipa_modify_assign (gimple *stmt_ptr, gimple_stmt_iterator *gsi,
 /* Traverse the function body and all modifications as described in
    ADJUSTMENTS.  Return true iff the CFG has been changed.  */
 
-static bool
+bool
 ipa_sra_modify_function_body (ipa_parm_adjustment_vec adjustments)
 {
   bool cfg_changed = false;
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 7d9c9ed..f50a5b1 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1688,6 +1688,16 @@ tree
 vectorizable_function (gimple call, tree vectype_out, tree vectype_in)
 {
   tree fndecl = gimple_call_fndecl (call);
+  struct cgraph_node *node = cgraph_get_node (fndecl);
+
+  if (node->has_simd_clones)
+    {
+      struct cgraph_node *clone = get_simd_clone (node, vectype_out);
+      if (clone)
+	return clone->symbol.decl;
+      /* Fall through in case we ever add support for
+	 non-built-ins.  */
+    }
 
   /* We only handle functions that do not read or clobber memory -- i.e.
      const or novops ones.  */
@@ -1758,10 +1768,12 @@ vectorizable_call (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
   vectype_in = NULL_TREE;
   nargs = gimple_call_num_args (stmt);
 
-  /* Bail out if the function has more than three arguments, we do not have
-     interesting builtin functions to vectorize with more than two arguments
-     except for fma.  No arguments is also not good.  */
-  if (nargs == 0 || nargs > 3)
+  /* Bail out if the function has more than three arguments.  We do
+     not have interesting builtin functions to vectorize with more
+     than two arguments except for fma (unless we have SIMD clones).
+     No arguments is also not good.  */
+  struct cgraph_node *node = cgraph_get_node (gimple_call_fndecl (stmt));
+  if (nargs == 0 || (!node->has_simd_clones && nargs > 3))
     return false;
 
   /* Ignore the argument of IFN_GOMP_SIMD_LANE, it is magic.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 8200c2e..aacb22b 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1318,6 +1318,10 @@ extern void protected_set_expr_location (tree, location_t);
 #define OMP_CLAUSE_LINEAR_NO_COPYOUT(NODE) \
   TREE_PRIVATE (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR))
 
+/* True if a LINEAR clause has a stride that is variable.  */
+#define OMP_CLAUSE_LINEAR_VARIABLE_STRIDE(NODE) \
+  TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR))
+
 #define OMP_CLAUSE_LINEAR_STEP(NODE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LINEAR), 1)
 

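For readers who want to check the mangled names expected by simd-clones-2.c by hand, here is a minimal standalone sketch of the scheme that simd_clone_mangle follows in the patch above.  It is not GCC code: struct arg_desc, append and mangle_simd_name are hypothetical names made up for this illustration, and the ISA character and simdlen are passed in directly rather than computed from the target.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-argument data in the patch.  */
enum stride_kind { STRIDE_NONE, STRIDE_CONSTANT, STRIDE_VARIABLE };

struct arg_desc
{
  int uniform;              /* Nonzero for a uniform() argument.  */
  enum stride_kind stride;  /* Kind of linear() stride, if any.  */
  unsigned long stride_num; /* Stride value (or argument number).  */
  unsigned alignment;       /* aligned() value, 0 if none.  */
};

/* Append formatted text to the NUL-terminated string in BUF,
   truncating if it would not fit in SIZE bytes.  */
static void
append (char *buf, size_t size, const char *fmt, ...)
{
  size_t len = strlen (buf);
  va_list ap;

  va_start (ap, fmt);
  if (len < size)
    vsnprintf (buf + len, size - len, fmt, ap);
  va_end (ap);
}

/* Build "_ZGV" <isa> <mask> <simdlen> <per-argument letters> "_" <name>.  */
static void
mangle_simd_name (char *buf, size_t size, char isa, int inbranch,
                  unsigned simdlen, const struct arg_desc *args,
                  unsigned nargs, const char *name)
{
  assert (buf && size > 0 && isa && simdlen);

  buf[0] = '\0';
  append (buf, size, "_ZGV%c%c%u", isa, inbranch ? 'M' : 'N', simdlen);

  for (unsigned i = 0; i < nargs; ++i)
    {
      const struct arg_desc *a = &args[i];

      if (a->uniform)
        append (buf, size, "u");
      else if (a->stride == STRIDE_CONSTANT)
        {
          /* A constant stride of 1 is implied by a bare 'l'.  */
          append (buf, size, "l");
          if (a->stride_num > 1)
            append (buf, size, "%lu", a->stride_num);
        }
      else if (a->stride == STRIDE_VARIABLE)
        append (buf, size, "s%lu", a->stride_num);
      else
        append (buf, size, "v");

      if (a->alignment)
        append (buf, size, "a%u", a->alignment);
    }

  append (buf, size, "_%s", name);
}
```

Feeding it the clauses from setArray in simd-clones-2.c (uniform(a) aligned(a:32) linear(k:1) notinbranch, with 'x' and simdlen 4) should reproduce the _ZGVxN4ua32vl_setArray string that the test scans for.  As in the patch, the mask argument of an inbranch clone is not part of the mangled name.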

* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-01  3:05 RFC: simd enabled functions (omp declare simd / elementals) Aldy Hernandez
@ 2013-11-01 10:57 ` Jakub Jelinek
  2013-11-01 12:35 ` Jakub Jelinek
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Jakub Jelinek @ 2013-11-01 10:57 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Richard Henderson, Richard Biener, Jan Hubicka, Martin Jambor,
	gcc-patches, Iyer, Balaji V

Hi!

On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote:
> Hello gentlemen.  I'm CCing all of you, because each of you can
> provide valuable feedback to various parts of the compiler which I
> touch.  I have sprinkled love notes with your names throughout the
> post :).

Thanks for working on this.

> 	* Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H.

You aren't changing Makefile.in anymore ;).

> +/* Given a NODE, return a compatible SIMD clone returning `vectype'.
> +   If none found, NULL is returned.  */
> +
> +struct cgraph_node *
> +get_simd_clone (struct cgraph_node *node, tree vectype)
> +{
> +  if (!node->has_simd_clones)
> +    return NULL;
> +
> +  /* FIXME: What to do with linear/uniform arguments.  */
> +
> +  /* FIXME: Nasty kludge until we figure out where to put the clone
> +     list-- perhaps, next_sibling_clone/prev_sibling_clone in
> +     cgraph_node ??.  */
> +  struct cgraph_node *t;
> +  FOR_EACH_FUNCTION (t)
> +    if (t->simdclone_of == node
> +	/* No inbranch vectorization for now.  */
> +	&& !t->simdclone->inbranch
> +	&& types_compatible_p (TREE_TYPE (TREE_TYPE (t->symbol.decl)),
> +			       vectype))
> +      break;
> +  return t;
> +}

You definitely need some quick way to find the simd clones, and you really
can't do this here anyway, because you have to check all arguments, return
type might be missing etc., so it needs to be done by vectorizable_call
itself.

> +  /* If this is a SIMD clone, this points to the SIMD specific
> +     information for it.  */
> +  struct simd_clone *simdclone;
> +
> +  /* If this is a SIMD clone, this points to the original scalar
> +     function.  */
> +  struct cgraph_node *simdclone_of;

Can't you put this into the simd_clone structure, in order not to waste
memory for functions which don't have simd clones?  So, you'd use
t->simdclone && t->simdclone->clone_of == node or similar (if you need it at
all, I guess better is to add a struct cgraph_node *simd_clones;
and put the prev/next pointers in struct simd_clone).

Let me start with two testcases:

test1.c:
int array[1000];

#pragma omp declare simd simdlen(4) notinbranch
#pragma omp declare simd simdlen(4) notinbranch uniform(b)
#pragma omp declare simd simdlen(8) notinbranch
#pragma omp declare simd simdlen(8) notinbranch uniform(b)
__attribute__((noinline)) int
foo (int a, int b)
{
  if (a == b)
    return 5;
  else
    return 6;
}

void
bar ()
{
  int i;
  for (i = 0; i < 1000; ++i)
    array[i] = foo (i, 123);
  for (i = 0; i < 1000; ++i)
    array[i] = foo (i, array[i]);
}

test2.c:
int array[1000];

#pragma omp declare simd simdlen(4) notinbranch aligned(a:16) uniform(a) linear(b)
#pragma omp declare simd simdlen(4) notinbranch aligned(a:32) uniform(a) linear(b)
#pragma omp declare simd simdlen(8) notinbranch aligned(a:16) uniform(a) linear(b)
#pragma omp declare simd simdlen(8) notinbranch aligned(a:32) uniform(a) linear(b)
__attribute__((noinline)) void
foo (int *a, int b, int c)
{
  a[b] = c;
}

void
bar ()
{
  int i;
  for (i = 0; i < 1000; ++i)
    foo (array, i, i * array[i]);
}

On test1.c -O3 -fopenmp {,-mavx,-mavx2}, you can see:
test1.c: In function ‘foo.simdclone.0’:
test1.c:8:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
 foo (int a, int b)
 ^
test1.c:8:1: warning: AVX vector argument without AVX enabled changes the ABI [enabled by default]
and the manglings are without -mavx{,2}
_ZGVxN8vu_foo
_ZGVxN8vv_foo
_ZGVxN4vu_foo
_ZGVxN4vv_foo
while with it _ZGVy* (surprisingly not Y).  As discussed earlier, we don't
want to decide which clones to create based on compiler options, we probably
want to create (unless told by Cilk+ processor clauses otherwise) entry
points for all the ABIs, just try to create the ones not matching compiler
options as small as possible, and use target attribute for those too
and say for _ZGVxN8v?_foo we need to pass the vector arguments in two
vector(4) int parameters rather than one vector(8) as it is done now (that
is why the above warnings and notes are printed).  But you know this
already... ;).
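
As a rough illustration of how the names listed above appear to be
put together, here is a minimal sketch.  It assumes the layout
_ZGV, ISA letter, mask letter, simdlen, one letter per parameter,
'_', scalar name.  Reading 'N' as notinbranch and using 'M' for
inbranch is an assumption taken from the Intel vector ABI and is not
stated in this thread; sketch_mangle is a hypothetical helper, not a
GCC function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Compose a clone name of the shape seen in the thread:
   _ZGV <isa> <mask> <simdlen> <params> _ <name>.
   ISA is 'x' or 'y' as in the manglings quoted above; MASKED selects
   'M' (assumed inbranch letter) over 'N' (notinbranch); PARAMS holds
   one letter per argument, e.g. "vu" for (vector, uniform).
   Returns what snprintf returns, i.e. the length of the full name.  */
static int
sketch_mangle (char *buf, size_t size, char isa, int masked,
               unsigned simdlen, const char *params, const char *name)
{
  assert (buf != NULL && params != NULL && name != NULL);
  return snprintf (buf, size, "_ZGV%c%c%u%s_%s",
                   isa, masked ? 'M' : 'N', simdlen, params, name);
}
```

With isa 'x', simdlen 8 and params "vu" this yields _ZGVxN8vu_foo,
matching the first name in the list above.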

The second testcase currently ICEs I guess during simd cloning, just wanted
to make it clear that while simd clones without any arguments probably don't
make any sense (other than const, but those really should be hoisted out of
the loop much earlier), simd clones with no return value make sense.

> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1688,6 +1688,16 @@ tree
>  vectorizable_function (gimple call, tree vectype_out, tree vectype_in)
>  {
>    tree fndecl = gimple_call_fndecl (call);
> +  struct cgraph_node *node = cgraph_get_node (fndecl);
> +
> +  if (node->has_simd_clones)
> +    {
> +      struct cgraph_node *clone = get_simd_clone (node, vectype_out);
> +      if (clone)
> +	return clone->symbol.decl;
> +      /* Fall through in case we ever add support for
> +	 non-built-ins.  */
> +    }

I think it is a bad idea to do this in vectorizable_function, as I said
earlier keying this on the result type won't work for functions returning
void, and more importantly, you really need access to detailed info about
all the arguments for finding out if you have a suitable clone, and
as test1.c shows, also for selection of the best of the clones if more than
one is suitable.  In test1.c, in the first loop the uniform variants are
better than the ones without a uniform second argument, though if the uniform
ones were missing, you could still use the ones with vv arguments,
because you can just pass a vector constant (or broadcast scalar element
into the vector).  Similarly, in test2.c, you want to check
get_pointer_alignment of the pointer, and if it is >= 32, you can use
the clones with aligned(:32) (and with (:16), but (:32) is supposedly
better), if it is >= 16, you can only use the clones with (:16), if it is <
16, you can't use anything.
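
The alignment rule just described can be sketched in isolation.  This
is only an illustration: struct sketch_clone and
sketch_pick_by_alignment are hypothetical names, and both the clone's
aligned() value and the known pointer alignment are assumed to be in
the same unit (bytes here).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one clone's aligned() clause.  */
struct sketch_clone
{
  unsigned required_align;	/* aligned(a:N) value, 0 if none.  */
};

/* Return the index of the clone with the largest alignment
   requirement that KNOWN_ALIGN still satisfies, or -1 if no clone
   can be used.  */
static int
sketch_pick_by_alignment (const struct sketch_clone *clones, size_t n,
                          unsigned known_align)
{
  int best = -1;
  for (size_t i = 0; i < n; ++i)
    {
      /* A clone that assumes more alignment than we know is unsafe.  */
      if (clones[i].required_align > known_align)
        continue;
      if (best < 0
          || clones[i].required_align > clones[best].required_align)
        best = (int) i;
    }
  return best;
}
```

For the (:16) and (:32) clones of test2.c, a pointer known to be
32-byte aligned selects the (:32) clone, a 16-byte aligned one selects
the (:16) clone, and anything less selects nothing.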

> @@ -1758,10 +1768,12 @@ vectorizable_call (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
>    vectype_in = NULL_TREE;
>    nargs = gimple_call_num_args (stmt);
>  
> -  /* Bail out if the function has more than three arguments, we do not have
> -     interesting builtin functions to vectorize with more than two arguments
> -     except for fma.  No arguments is also not good.  */
> -  if (nargs == 0 || nargs > 3)
> +  /* Bail out if the function has more than three arguments.  We do
> +     not have interesting builtin functions to vectorize with more
> +     than two arguments except for fma (unless we have SIMD clones).
> +     No arguments is also not good.  */
> +  struct cgraph_node *node = cgraph_get_node (gimple_call_fndecl (stmt));
> +  if (nargs == 0 || (!node->has_simd_clones && nargs > 3))
>      return false;

In the end, I think for the vectorization of elemental function calls
it might be best to write a new function, vectorizable_simd_clone_call
or similar, because if we add all the support for it into vectorizable_call,
it might be unmaintainable.  Normal vectorizable_calls rely on all the input
arguments being of the same type, the return type must not be void,
but the return type doesn't have to be the same as argument types (which
means NARROW, NONE or WIDEN kind of expansion).
The simd clones can have arbitrary argument types, so the concept of
narrowing/widening etc. doesn't work well in that case.  So I'd probably go
for starting with copy of vectorizable_call, call it at the same spots as
vectorizable_call (after it), then remove the same argument restrictions and
start updating it to do the analysis of suitable clones and some priority
mechanism on which simd clones are best (give bonus points for uniform
arguments if the argument is uniform, linear if it is linear, for highest
alignment, notinbranch/inbranch, simdlen, etc.).
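
A possible shape for such a priority mechanism, reduced to the
uniform/linear/vector part, is sketched below.  The enum, the function
names and the point values are all made up for illustration; only the
idea of giving bonus points for matching kinds and rejecting unusable
clones comes from the paragraph above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification, used both for a clone's parameter
   and for the actual argument at the call site.  */
enum sketch_kind { SK_VECTOR, SK_UNIFORM, SK_LINEAR };

/* Score how well a clone parameter of kind PARAM accepts a call
   argument classified as ARG.  A negative result means unusable.  */
static int
sketch_score_arg (enum sketch_kind param, enum sketch_kind arg)
{
  if (param == arg)
    return param == SK_VECTOR ? 1 : 3;	/* Bonus for scalar passing.  */
  if (param == SK_VECTOR)
    return 0;	/* Uniform or linear values can be expanded to a vector.  */
  return -1;	/* E.g. a varying value cannot feed a uniform parameter.  */
}

/* Sum the per-argument scores, or return -1 if any argument makes
   the clone unusable.  */
static int
sketch_score_clone (const enum sketch_kind *params,
                    const enum sketch_kind *args, size_t nargs)
{
  int total = 0;
  for (size_t i = 0; i < nargs; ++i)
    {
      int s = sketch_score_arg (params[i], args[i]);
      if (s < 0)
        return -1;
      total += s;
    }
  return total;
}
```

For the first loop of test1.c (foo (i, 123), i.e. linear and uniform
arguments) the vu clone scores higher than the vv clone, while for the
second loop (foo (i, array[i])) only the vv clone is usable.  A real
implementation would also have to compare the linear step, alignment,
simdlen and inbranch/notinbranch.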

So, e.g.
      /* We can only handle calls with arguments of the same type.  */
      if (rhs_type
          && !types_compatible_p (rhs_type, TREE_TYPE (op)))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                             "argument types differ.\n");
          return false;
        }
shouldn't be done for the simd clones, vectype_in should go, vectype_out
should be renamed to vectype and set to first argument's? type if result is
void or unused.

      if (!vect_is_simple_use_1 (op, stmt, loop_vinfo, bb_vinfo,
                                 &def_stmt, &def, &dt[i], &opvectype))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                             "use not simple.\n");
          return false;
        }

(note, right now in the patch this is a buffer overflow for nargs > 3; for
simd clones you probably want to dynamically allocate the dt array, and
supposedly also remember opvectype for each argument).
If opvectype above is NULL, then that argument can be passed to
uniform arguments (or of course broadcast into a vector for vector
arguments).  To find out if for opvectype != NULL you can pass it to
linear argument, supposedly you could call simple_iv, and compare the
iv.step computed by it if it returned true with the linear step.
To check alignment, supposedly if it is uniform (opvectype == NULL),
you'd just call get_pointer_alignment if it is a pointer, otherwise
maybe give up for now?  I mean, if it is e.g. linear, it would be harder,
you'd need to know if peeling for alignment will be needed and only if
known not to be needed you could simple_iv it and see if base has right
get_pointer_alignment and step multiplied by vectorization factor
keeps the alignment right.  Though, if you need to insert more than one
call of the simdclone, you'd need to verify that it is right for all the
calls.  Right now we have no way to express conditional calls in the
ifconverted IL, so right now we'll punt on all conditional calls, but
at least the vectorizer can fall back (with much lower priority) to
inbranch elementals if notinbranch doesn't exist (just pass all ones mask).

	Jakub

* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-01  3:05 RFC: simd enabled functions (omp declare simd / elementals) Aldy Hernandez
  2013-11-01 10:57 ` Jakub Jelinek
@ 2013-11-01 12:35 ` Jakub Jelinek
  2013-11-01 16:51   ` Jakub Jelinek
  2013-11-04 10:37 ` Richard Biener
  2013-11-07 16:43 ` Martin Jambor
  3 siblings, 1 reply; 11+ messages in thread
From: Jakub Jelinek @ 2013-11-01 12:35 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Richard Henderson, Richard Biener, Jan Hubicka, Martin Jambor,
	gcc-patches, Iyer, Balaji V

Hi!

One more thing:

On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote:
> +enum linear_stride_type {
> +  LINEAR_STRIDE_NO,
> +  LINEAR_STRIDE_YES_CONSTANT,
> +  LINEAR_STRIDE_YES_VARIABLE
> +};
...
> +  /* If the linear stride is a constant, `linear_stride' is
> +     LINEAR_STRIDE_YES_CONSTANT, and `linear_stride_num' holds
> +     the numeric stride.
> +
> +     If the linear stride is variable, `linear_stride' is
> +     LINEAR_STRIDE_YES_VARIABLE, and `linear_stride_num' contains
> +     the function argument containing the stride (as an index into the
> +     function arguments starting at 0).
> +
> +     Otherwise, `linear_stride' is LINEAR_STRIDE_NO and
> +     `linear_stride_num' is unused.  */
> +  enum linear_stride_type linear_stride;
> +  unsigned HOST_WIDE_INT linear_stride_num;
> +
> +  /* Variable alignment if available, otherwise 0.  */
> +  unsigned int alignment;
> +
> +  /* True if variable is uniform.  */
> +  unsigned int uniform : 1;
> +};

At least the OpenMP standard disallows one argument to be both
uniform and linear (but, apparently I forgot to diagnose, fixed thusly,
committed to trunk), and even if Cilk+ didn't disallow it explicitly,
linear together with uniform doesn't make sense (unless we consider
uniform a special case of linear with step 0).  The Intel mangling PDF
doesn't allow the same argument to be both uniform and linear either.
So, IMHO much better would be to have an enum simd_clone_arg_type which
would be
enum simd_clone_arg_type
{
  SIMD_CLONE_ARG_TYPE_VECTOR,
  SIMD_CLONE_ARG_TYPE_UNIFORM,
  SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP,
  SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP
};
drop uniform bitfield, change linear_stride_num to
say union { unsigned HOST_WIDE_INT linear_constant_step; int linear_step_argno; };
or similar.
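
To make the suggestion concrete, here is a minimal standalone sketch
of such a per-argument record.  Only the enumerator names and the two
union member names come from the proposal above; struct
sketch_clone_arg, its layout, and the use of long long in place of
HOST_WIDE_INT are assumptions made for illustration.

```c
#include <assert.h>

enum simd_clone_arg_type
{
  SIMD_CLONE_ARG_TYPE_VECTOR,
  SIMD_CLONE_ARG_TYPE_UNIFORM,
  SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP,
  SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP
};

/* Hypothetical per-argument record.  The kind is a single tag, so an
   argument cannot be both uniform and linear.  */
struct sketch_clone_arg
{
  enum simd_clone_arg_type arg_type;
  union
  {
    long long linear_constant_step;	/* LINEAR_CONSTANT_STEP only.  */
    int linear_step_argno;		/* LINEAR_VARIABLE_STEP only.  */
  } u;
  unsigned int alignment;		/* 0 if unknown.  */
};

/* Nonzero if the argument stays a scalar in the clone, i.e. it is
   uniform or linear rather than vector.  */
static int
sketch_passed_as_scalar (const struct sketch_clone_arg *arg)
{
  return arg->arg_type != SIMD_CLONE_ARG_TYPE_VECTOR;
}
```

Because SIMD_CLONE_ARG_TYPE_VECTOR is the first enumerator, a
zero-initialised record describes a plain vector argument, so
arguments not named in any clause need no explicit setup.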

2013-11-01  Jakub Jelinek  <jakub@redhat.com>

	* c-typeck.c (c_finish_omp_clauses) <case OMP_CLAUSE_UNIFORM>: Go to
	check_dup_generic at the end, unless remove is true.
	(c_finish_omp_clauses) <case OMP_CLAUSE_REDUCTION>: Add break; after
	remove = true;.
	(c_finish_omp_clauses) <case OMP_CLAUSE_COPYIN>: Likewise.

	* semantics.c (finish_omp_clauses) <case OMP_CLAUSE_UNIFORM>: Go to
	check_dup_generic at the end, unless remove is true.
	(finish_omp_clauses) <case OMP_CLAUSE_LINEAR>: Add break; after
	remove = true;.

	* gcc.dg/gomp/declare-simd-2.c (f12, f13, f14, f15, f16, f17): New
	tests.
	* g++.dg/gomp/declare-simd-2.C (f15, f16, f17, f18, f19, f20): New
	tests.

--- gcc/c/c-typeck.c.jj	2013-10-31 20:05:44.000000000 +0100
+++ gcc/c/c-typeck.c	2013-11-01 13:07:20.330051746 +0100
@@ -11316,6 +11316,7 @@ c_finish_omp_clauses (tree clauses)
 			    "%qE has invalid type for %<reduction(%s)%>",
 			    t, r_name);
 		  remove = true;
+		  break;
 		}
 	    }
 	  else if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) == error_mark_node)
@@ -11323,6 +11324,7 @@ c_finish_omp_clauses (tree clauses)
 	      error_at (OMP_CLAUSE_LOCATION (c),
 			"user defined reduction not found for %qD", t);
 	      remove = true;
+	      break;
 	    }
 	  else if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c))
 	    {
@@ -11406,6 +11408,7 @@ c_finish_omp_clauses (tree clauses)
 	      error_at (OMP_CLAUSE_LOCATION (c),
 			"%qE must be %<threadprivate%> for %<copyin%>", t);
 	      remove = true;
+	      break;
 	    }
 	  goto check_dup_generic;
 
@@ -11615,8 +11618,9 @@ c_finish_omp_clauses (tree clauses)
 		error_at (OMP_CLAUSE_LOCATION (c),
 			  "%qE is not an argument in %<uniform%> clause", t);
 	      remove = true;
+	      break;
 	    }
-	  break;
+	  goto check_dup_generic;
 
 	case OMP_CLAUSE_NOWAIT:
 	  if (copyprivate_seen)
--- gcc/cp/semantics.c.jj	2013-10-31 20:05:44.000000000 +0100
+++ gcc/cp/semantics.c	2013-11-01 13:10:29.006068213 +0100
@@ -5188,12 +5188,16 @@ finish_omp_clauses (tree clauses)
 	  if (t == NULL_TREE)
 	    t = integer_one_node;
 	  if (t == error_mark_node)
-	    remove = true;
+	    {
+	      remove = true;
+	      break;
+	    }
 	  else if (!type_dependent_expression_p (t)
 		   && !INTEGRAL_TYPE_P (TREE_TYPE (t)))
 	    {
 	      error ("linear step expression must be integral");
 	      remove = true;
+	      break;
 	    }
 	  else
 	    {
@@ -5210,7 +5214,10 @@ finish_omp_clauses (tree clauses)
 					   MINUS_EXPR, sizetype, t,
 					   OMP_CLAUSE_DECL (c));
 		      if (t == error_mark_node)
-			remove = true;
+			{
+			  remove = true;
+			  break;
+			}
 		    }
 		}
 	      OMP_CLAUSE_LINEAR_STEP (c) = t;
@@ -5626,8 +5633,9 @@ finish_omp_clauses (tree clauses)
 	      else
 		error ("%qE is not an argument in %<uniform%> clause", t);
 	      remove = true;
+	      break;
 	    }
-	  break;
+	  goto check_dup_generic;
 
 	case OMP_CLAUSE_NOWAIT:
 	case OMP_CLAUSE_ORDERED:
--- gcc/testsuite/gcc.dg/gomp/declare-simd-2.c.jj	2013-10-31 20:05:44.000000000 +0100
+++ gcc/testsuite/gcc.dg/gomp/declare-simd-2.c	2013-11-01 12:56:58.124140252 +0100
@@ -39,3 +39,16 @@ struct D { int d; };
 
 #pragma omp declare simd aligned (e)    /* { dg-error "neither a pointer nor an array" } */
 int fn11 (struct D e);   
+
+#pragma omp declare simd linear(a:7) uniform(a)	/* { dg-error "appears more than once" } */
+int f12 (int a);
+#pragma omp declare simd linear(a) linear(a)	/* { dg-error "appears more than once" } */
+int f13 (int a);
+#pragma omp declare simd linear(a) linear(a:7)	/* { dg-error "appears more than once" } */
+int f14 (int a);
+#pragma omp declare simd linear(a:6) linear(a:6)/* { dg-error "appears more than once" } */
+int f15 (int a);
+#pragma omp declare simd uniform(a) uniform(a)	/* { dg-error "appears more than once" } */
+int f16 (int a);
+#pragma omp declare simd uniform(a) aligned (a: 32)
+int f17 (int *a);
--- gcc/testsuite/g++.dg/gomp/declare-simd-2.C.jj	2013-10-31 20:05:44.000000000 +0100
+++ gcc/testsuite/g++.dg/gomp/declare-simd-2.C	2013-11-01 12:58:08.277839837 +0100
@@ -82,4 +82,17 @@ int fn14 (double &d);
 #pragma omp declare simd aligned (e)	// { dg-error "neither a pointer nor an array" }
 int fn14 (D e);
 
+#pragma omp declare simd linear(a:7) uniform(a)	// { dg-error "appears more than once" }
+int f15 (int a);
+#pragma omp declare simd linear(a) linear(a)	// { dg-error "appears more than once" }
+int f16 (int a);
+#pragma omp declare simd linear(a) linear(a:7)	// { dg-error "appears more than once" }
+int f17 (int a);
+#pragma omp declare simd linear(a:6) linear(a:6)// { dg-error "appears more than once" }
+int f18 (int a);
+#pragma omp declare simd uniform(a) uniform(a)	// { dg-error "appears more than once" }
+int f19 (int a);
+#pragma omp declare simd uniform(a) aligned (a: 32)
+int f20 (int *a);
+
 // { dg-error "has no member" "" { target *-*-* } 61 }

	Jakub

* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-01 12:35 ` Jakub Jelinek
@ 2013-11-01 16:51   ` Jakub Jelinek
  0 siblings, 0 replies; 11+ messages in thread
From: Jakub Jelinek @ 2013-11-01 16:51 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Richard Henderson, Richard Biener, Jan Hubicka, Martin Jambor,
	gcc-patches, Iyer, Balaji V

On Fri, Nov 01, 2013 at 01:35:35PM +0100, Jakub Jelinek wrote:
> So, IMHO much better would be to have an enum simd_clone_arg_type which
> would be
> enum simd_clone_arg_type
> {
>   SIMD_CLONE_ARG_TYPE_VECTOR,
>   SIMD_CLONE_ARG_TYPE_UNIFORM,
>   SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP,
>   SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP
> };
> drop uniform bitfield, change linear_stride_num to
> say union { unsigned HOST_WIDE_INT linear_constant_step; int linear_step_argno; };
> or similar.

I've committed this to gomp-4_0-branch as follow-up to your patch:

2013-11-01  Jakub Jelinek  <jakub@redhat.com>

	* cgraph.h (enum linear_stride_type): Remove.
	(enum simd_clone_arg_type): New.
	(struct simd_clone_arg): Remove linear_stride, linear_stride_num
	and uniform fields.  Add arg_type and linear_step.
	* omp-low.c (simd_clone_struct_copy): Formatting.
	(simd_clone_struct_alloc): Likewise.  Use size_t.
	(simd_clone_clauses_extract, simd_clone_compute_base_data_type,
	simd_clone_adjust_argument_types): Adjust for struct simd_clone_arg
	changes.
	(simd_clone_mangle): Likewise.  Handle negative linear step.
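
For readers who want the per-argument mangling without reading the
hunk, the following standalone sketch mirrors the logic of the
simd_clone_mangle change in the patch below, using snprintf instead of
the pretty-printer.  The MK_* names, sketch_mangle_arg and the use of
long long in place of HOST_WIDE_INT are hypothetical; this is not the
committed code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the simd_clone_arg_type enumerators.  */
enum sketch_arg_type
{
  MK_VECTOR,
  MK_UNIFORM,
  MK_LINEAR_CONSTANT_STEP,
  MK_LINEAR_VARIABLE_STEP
};

/* Write the mangling letter(s) for one argument into BUF.
   LINEAR_STEP is the constant step, or the index of the argument
   holding the step for MK_LINEAR_VARIABLE_STEP.  */
static int
sketch_mangle_arg (char *buf, size_t size, enum sketch_arg_type type,
                   long long linear_step)
{
  switch (type)
    {
    case MK_UNIFORM:
      return snprintf (buf, size, "u");
    case MK_LINEAR_CONSTANT_STEP:
      assert (linear_step != 0);
      if (linear_step > 0)
        return snprintf (buf, size, "l%llu",
                         (unsigned long long) linear_step);
      /* Negative step: 'n' followed by the magnitude.  Negating in
         the unsigned type avoids signed overflow.  */
      return snprintf (buf, size, "ln%llu",
                       -(unsigned long long) linear_step);
    case MK_LINEAR_VARIABLE_STEP:
      return snprintf (buf, size, "s%llu",
                       (unsigned long long) linear_step);
    default:
      return snprintf (buf, size, "v");
    }
}
```

A constant step of 7 gives "l7", a step of -2 gives "ln2", and a
variable step held in argument 1 gives "s1".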

--- gcc/cgraph.h.jj	2013-11-01 17:11:42.000000000 +0100
+++ gcc/cgraph.h	2013-11-01 17:24:59.472995514 +0100
@@ -250,10 +250,12 @@ struct GTY(()) cgraph_clone_info
   bitmap combined_args_to_skip;
 };
 
-enum linear_stride_type {
-  LINEAR_STRIDE_NO,
-  LINEAR_STRIDE_YES_CONSTANT,
-  LINEAR_STRIDE_YES_VARIABLE
+enum simd_clone_arg_type
+{
+  SIMD_CLONE_ARG_TYPE_VECTOR,
+  SIMD_CLONE_ARG_TYPE_UNIFORM,
+  SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP,
+  SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP
 };
 
 /* Function arguments in the original function of a SIMD clone.
@@ -282,28 +284,17 @@ struct GTY(()) simd_clone_arg {
   tree simd_array;
 
   /* A SIMD clone's argument can be either linear (constant or
-     variable), uniform, or vector.  If the argument is neither linear
-     or uniform, the default is vector.  */
+     variable), uniform, or vector.  */
+  enum simd_clone_arg_type arg_type;
 
-  /* If the linear stride is a constant, `linear_stride' is
-     LINEAR_STRIDE_YES_CONSTANT, and `linear_stride_num' holds
-     the numeric stride.
-
-     If the linear stride is variable, `linear_stride' is
-     LINEAR_STRIDE_YES_VARIABLE, and `linear_stride_num' contains
-     the function argument containing the stride (as an index into the
-     function arguments starting at 0).
-
-     Otherwise, `linear_stride' is LINEAR_STRIDE_NO and
-     `linear_stride_num' is unused.  */
-  enum linear_stride_type linear_stride;
-  unsigned HOST_WIDE_INT linear_stride_num;
+  /* For arg_type SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP this is
+     the constant linear step, if arg_type is
+     SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP, this is index of
+     the uniform argument holding the step, otherwise 0.  */
+  HOST_WIDE_INT linear_step;
 
   /* Variable alignment if available, otherwise 0.  */
   unsigned int alignment;
-
-  /* True if variable is uniform.  */
-  unsigned int uniform : 1;
 };
 
 /* Specific data for a SIMD function clone.  */
--- gcc/omp-low.c.jj	2013-11-01 17:11:42.000000000 +0100
+++ gcc/omp-low.c	2013-11-01 17:41:26.635904034 +0100
@@ -10561,8 +10561,8 @@ static struct simd_clone *
 simd_clone_struct_alloc (int nargs)
 {
   struct simd_clone *clone_info;
-  int len = sizeof (struct simd_clone)
-    + nargs * sizeof (struct simd_clone_arg);
+  size_t len = (sizeof (struct simd_clone)
+		+ nargs * sizeof (struct simd_clone_arg));
   clone_info = ggc_alloc_cleared_simd_clone_stat (len PASS_MEM_STAT);
   return clone_info;
 }
@@ -10572,8 +10572,8 @@ simd_clone_struct_alloc (int nargs)
 static inline void
 simd_clone_struct_copy (struct simd_clone *to, struct simd_clone *from)
 {
-  memcpy (to, from, sizeof (struct simd_clone)
-	  + from->nargs * sizeof (struct simd_clone_arg));
+  memcpy (to, from, (sizeof (struct simd_clone)
+		     + from->nargs * sizeof (struct simd_clone_arg)));
 }
 
 /* Given a simd clone in NEW_NODE, extract the simd specific
@@ -10637,31 +10637,27 @@ simd_clone_clauses_extract (struct cgrap
 	    int argno = TREE_INT_CST_LOW (decl);
 	    if (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE (t))
 	      {
-		clone_info->args[argno].linear_stride
-		  = LINEAR_STRIDE_YES_VARIABLE;
-		clone_info->args[argno].linear_stride_num
-		  = TREE_INT_CST_LOW (step);
-		gcc_assert (!TREE_INT_CST_HIGH (step));
+		clone_info->args[argno].arg_type
+		  = SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP;
+		clone_info->args[argno].linear_step
+		  = tree_low_cst (step, 0);
+		gcc_assert (clone_info->args[argno].linear_step >= 0
+			    && clone_info->args[argno].linear_step < n);
 	      }
 	    else
 	      {
-		if (TREE_INT_CST_HIGH (step))
-		  {
-		    /* It looks like this can't really happen, since the
-		       front-ends generally issue:
-
-		       warning: integer constant is too large for its type.
-
-		       But let's assume somehow we got past all that.  */
-		    warning_at (DECL_SOURCE_LOCATION (decl), 0,
-				"ignoring large linear step");
-		  }
+		if (!host_integerp (step, 0))
+		  warning_at (OMP_CLAUSE_LOCATION (t), 0,
+			      "ignoring large linear step");
+		else if (integer_zerop (step))
+		  warning_at (OMP_CLAUSE_LOCATION (t), 0,
+			      "ignoring zero linear step");
 		else
 		  {
-		    clone_info->args[argno].linear_stride
-		      = LINEAR_STRIDE_YES_CONSTANT;
-		    clone_info->args[argno].linear_stride_num
-		      = TREE_INT_CST_LOW (step);
+		    clone_info->args[argno].arg_type
+		      = SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP;
+		    clone_info->args[argno].linear_step
+		      = tree_low_cst (step, 0);
 		  }
 	      }
 	    break;
@@ -10670,7 +10666,8 @@ simd_clone_clauses_extract (struct cgrap
 	  {
 	    tree decl = OMP_CLAUSE_DECL (t);
 	    int argno = tree_low_cst (decl, 1);
-	    clone_info->args[argno].uniform = 1;
+	    clone_info->args[argno].arg_type
+	      = SIMD_CLONE_ARG_TYPE_UNIFORM;
 	    break;
 	  }
 	case OMP_CLAUSE_ALIGNED:
@@ -10731,14 +10728,12 @@ simd_clone_compute_base_data_type (struc
     {
       argno_map map (fndecl);
       for (unsigned int i = 0; i < new_node->simdclone->nargs; ++i)
-	{
-	  struct simd_clone_arg arg = new_node->simdclone->args[i];
-	  if (!arg.uniform && arg.linear_stride == LINEAR_STRIDE_NO)
-	    {
-	      type = TREE_TYPE (map[i]);
-	      break;
-	    }
-	}
+	if (new_node->simdclone->args[i].arg_type
+	    == SIMD_CLONE_ARG_TYPE_VECTOR)
+	  {
+	    type = TREE_TYPE (map[i]);
+	    break;
+	  }
     }
 
   /* c) If the characteristic data type determined by a) or b) above
@@ -10824,20 +10819,25 @@ simd_clone_mangle (struct cgraph_node *o
     {
       struct simd_clone_arg arg = new_node->simdclone->args[n];
 
-      if (arg.uniform)
+      if (arg.arg_type == SIMD_CLONE_ARG_TYPE_UNIFORM)
 	pp_character (&pp, 'u');
-      else if (arg.linear_stride == LINEAR_STRIDE_YES_CONSTANT)
+      else if (arg.arg_type == SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP)
 	{
-	  gcc_assert (arg.linear_stride_num != 0);
+	  gcc_assert (arg.linear_step != 0);
 	  pp_character (&pp, 'l');
-	  if (arg.linear_stride_num > 1)
-	    pp_unsigned_wide_integer (&pp,
-				      arg.linear_stride_num);
+	  if (arg.linear_step > 0)
+	    pp_unsigned_wide_integer (&pp, arg.linear_step);
+	  else
+	    {
+	      pp_character (&pp, 'n');
+	      pp_unsigned_wide_integer (&pp, (-(unsigned HOST_WIDE_INT)
+					      arg.linear_step));
+	    }
 	}
-      else if (arg.linear_stride == LINEAR_STRIDE_YES_VARIABLE)
+      else if (arg.arg_type == SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP)
 	{
 	  pp_character (&pp, 's');
-	  pp_unsigned_wide_integer (&pp, arg.linear_stride_num);
+	  pp_unsigned_wide_integer (&pp, arg.linear_step);
 	}
       else
 	pp_character (&pp, 'v');
@@ -10975,8 +10975,7 @@ simd_clone_adjust_argument_types (struct
 
       node->simdclone->args[i].orig_arg = parm;
 
-      if (node->simdclone->args[i].uniform
-	  || node->simdclone->args[i].linear_stride != LINEAR_STRIDE_NO)
+      if (node->simdclone->args[i].arg_type != SIMD_CLONE_ARG_TYPE_VECTOR)
 	{
 	  /* No adjustment necessary for scalar arguments.  */
 	  adj.copy_param = 1;

	Jakub

* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-01  3:05 RFC: simd enabled functions (omp declare simd / elementals) Aldy Hernandez
  2013-11-01 10:57 ` Jakub Jelinek
  2013-11-01 12:35 ` Jakub Jelinek
@ 2013-11-04 10:37 ` Richard Biener
  2013-11-04 10:58   ` Jakub Jelinek
  2013-11-07 16:43 ` Martin Jambor
  3 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2013-11-04 10:37 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Jakub Jelinek, Richard Henderson, Jan Hubicka, Martin Jambor,
	gcc-patches, Iyer, Balaji V

On Fri, Nov 1, 2013 at 4:04 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> Hello gentlemen.  I'm CCing all of you, because each of you can provide
> valuable feedback to various parts of the compiler which I touch.  I have
> sprinkled love notes with your names throughout the post :).
>
> This is a patch against the gomp4 branch.  It provides initial support for
> simd-enabled functions which are "#pragma omp declare simd" in the OpenMP
> world and elementals in Cilk Plus nomenclature.  The parsing bits for OpenMP
> are already in trunk, but they are silently ignored.  This patch aims to
> remedy the situation.  The Cilk Plus parsing bits, OTOH, are not ready, but
> could trivially be adapted to use this infrastructure (see below).
>
> I would like to at least get this into the gomp4 branch for now, because I
> am accumulating far too many changes locally.
>
> The main idea is that for a simd annotated function, we can create one or
> more cloned vector variants of a scalar function that can later be used by
> the vectorizer.
>
> For a simple example with multiple returns...
>
> #pragma omp declare simd simdlen(4) notinbranch
> int foo (int a, int b)
> {
>   if (a == b)
>     return 555;
>   else
>     return 666;
> }
>
> ...we would generate with this patch (unoptimized):

Just a quick question, queueing the thread for later review (aww, queue
back at >100 threads again - I can't get any work done anymore :().

What does #pragma omp declare simd guarantee about memory
side-effects and memory use in the function?  That is, unless the
function can be safely annotated with the 'const' attribute the
whole thing is useless for the vectorizer.

Thanks,
Richard.

> foo.simdclone.0 (vector(4) int simd.4, vector(4) int simd.5)
> {
>   unsigned int iter.6;
>   int b.3[4];
>   int a.2[4];
>   int retval.1[4];
>   int _3;
>   int _5;
>   int _6;
>   vector(4) int _7;
>
>   <bb 2>:
>   a.2 = VIEW_CONVERT_EXPR<int[4]>(simd.4);
>   b.3 = VIEW_CONVERT_EXPR<int[4]>(simd.5);
>   iter.6_12 = 0;
>
>   <bb 3>:
>   # iter.6_9 = PHI <iter.6_12(2), iter.6_14(6)>
>   _5 = a.2[iter.6_9];
>   _6 = b.3[iter.6_9];
>   if (_5 == _6)
>     goto <bb 5>;
>   else
>     goto <bb 4>;
>
>   <bb 4>:
>
>   <bb 5>:
>   # _3 = PHI <555(3), 666(4)>
>   retval.1[iter.6_9] = _3;
>   iter.6_14 = iter.6_9 + 1;
>   if (iter.6_14 < 4)
>     goto <bb 6>;
>   else
>     goto <bb 7>;
>
>   <bb 6>:
>   goto <bb 3>;
>
>   <bb 7>:
>   _7 = VIEW_CONVERT_EXPR<vector(4) int>(retval.1);
>   return _7;
>
> }
>
> The new loop is properly created and annotated with loop->force_vect=true
> and loop->safelen set.
>
> A possible use may be:
>
> int array[1000];
> void bar ()
> {
>   int i;
>   for (i=0; i < 1000; ++i)
>     array[i] = foo(i, 123);
> }
>
> In which case, we would use the simd clone if available:
>
> bar ()
> {
>   vector(4) int vect_cst_.21;
>   vector(4) int vect_i_6.20;
>   vector(4) int * vectp_array.19;
>   vector(4) int * vectp_array.18;
>   vector(4) int vect_cst_.17;
>   vector(4) int vect__4.16;
>   vector(4) int vect_vec_iv_.15;
>   vector(4) int vect_cst_.14;
>   vector(4) int vect_cst_.13;
>   int stmp_var_.12;
>   int i;
>   unsigned int ivtmp_1;
>   int _4;
>   unsigned int ivtmp_7;
>   unsigned int ivtmp_20;
>   unsigned int ivtmp_21;
>
>   <bb 2>:
>   vect_cst_.13_8 = { 0, 1, 2, 3 };
>   vect_cst_.14_2 = { 4, 4, 4, 4 };
>   vect_cst_.17_13 = { 123, 123, 123, 123 };
>   vectp_array.19_15 = &array;
>   vect_cst_.21_5 = { 1, 1, 1, 1 };
>   goto <bb 4>;
>
>   <bb 3>:
>
>   <bb 4>:
>   # i_9 = PHI <i_6(3), 0(2)>
>   # ivtmp_1 = PHI <ivtmp_7(3), 1000(2)>
>   # vect_vec_iv_.15_11 = PHI <vect_vec_iv_.15_12(3), vect_cst_.13_8(2)>
>   # vectp_array.18_16 = PHI <vectp_array.18_17(3), vectp_array.19_15(2)>
>   # ivtmp_20 = PHI <ivtmp_21(3), 0(2)>
>   vect_vec_iv_.15_12 = vect_vec_iv_.15_11 + vect_cst_.14_2;
>   vect__4.16_14 = foo.simdclone.0 (vect_vec_iv_.15_11, vect_cst_.17_13);
>   _4 = 0;
>   MEM[(int *)vectp_array.18_16] = vect__4.16_14;
>   vect_i_6.20_19 = vect_vec_iv_.15_11 + vect_cst_.21_5;
>   i_6 = i_9 + 1;
>   ivtmp_7 = ivtmp_1 - 1;
>   vectp_array.18_17 = vectp_array.18_16 + 16;
>   ivtmp_21 = ivtmp_20 + 1;
>   if (ivtmp_21 < 250)
>     goto <bb 3>;
>   else
>     goto <bb 5>;
>
>   <bb 5>:
>   return;
>
> }
>
> That's the idea.
>
> Some of the ABI issues still need to be resolved (mangling for avx-512, what
> to do with non x86 architectures, what (if any) default clones will be
> created when no vector length is specified, etc etc), but the main
> functionality can be seen above.
>
> Uniform and linear parameters (which are passed as scalars) are still not
> handled.  Also, Jakub mentioned that with the current vectorizer we probably
> can't make good use of the inbranch/masked clones.  I have a laundry list of
> missing things prepended by // FIXME if anyone is curious.
>
> I'd like some feedback from y'all in your respective areas, since this
> touches a few places besides OpenMP.  For instance...
>
> [Honza] Where do you suggest I place a list of simd clones for a particular
> (scalar) function?  Right now I have added a simdclone_of field in
> cgraph_node and am (temporarily) serially scanning all functions in
> get_simd_clone().  This is obviously inefficient.  I didn't know whether to
> use the current next_sibling_clone/etc fields or create my own.  I tried
> using clone_of, and that caused some havoc so I'd like some feedback.
>
> [Martin] I have adapted the ipa_parm_adjustment infrastructure to allow
> adding new arguments out of the blue like you mentioned was missing in
> ipa-prop.h.  I have also added support for creating vectors of arguments.
> Could you take a look at my changes to ipa-prop.[ch]?
>
> [Martin] I need to add new arguments in the case of inbranch clones, which
> add an additional vector with a mask as the last argument:  For the
> following:
>
> #pragma omp declare simd simdlen(4) inbranch
> int foo (int a)
> {
>   return a + 1234;
> }
>
> ...we would generate a clone with:
>
> vector(4) int
> foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5)
>
> I thought it best to enhance ipa_modify_formal_parameters() and associated
> machinery than to add the new argument ad-hoc.  We already have enough ways
> of doing tree and cgraph versioning in the compiler ;-).
>
> [Richi] I would appreciate feedback on the vectorizer and the infrastructure
> as a whole.  Do keep in mind that this is a work in progress :).
>
> [Balaji] This patch would provide the infrastructure that can be used by the
> Cilk Plus elementals.  When this is complete, all that would be missing is
> the parser.  You would have to tag the original function with "omp declare
> simd" and "cilk plus elemental" attributes.  See simd_clone_clauses_extract.
>
> [Jakub/rth]: As usual, valuable feedback on OpenMP and everything else is
> greatly appreciated.
>
> Oh yeah, there are many more changes that would ideally be needed in the
> vectorizer.
>
> Fire away!


* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-04 10:37 ` Richard Biener
@ 2013-11-04 10:58   ` Jakub Jelinek
  0 siblings, 0 replies; 11+ messages in thread
From: Jakub Jelinek @ 2013-11-04 10:58 UTC (permalink / raw)
  To: Richard Biener
  Cc: Aldy Hernandez, Richard Henderson, Jan Hubicka, Martin Jambor,
	gcc-patches, Iyer, Balaji V

Hi!

On Mon, Nov 04, 2013 at 11:37:19AM +0100, Richard Biener wrote:
> Just a quick question, queueing the thread for later review (aww, queue
> back at >100 threads again - I can't get any work done anymore :().
> 
> What does #pragma omp declare simd guarantee about memory
> side-effects and memory use in the function?  That is, unless the
> function can be safely annotated with the 'const' attribute the
> whole thing is useless for the vectorizer.

The main restriction is:

"The execution of the function or subroutine cannot have any side effects that would
alter its execution for concurrent iterations of a SIMD chunk."

There are other restrictions: omp declare simd functions can't call
setjmp/longjmp or throw, etc.

The functions certainly can't be in the general case annotated with the
const attribute, they can read and write memory, but the user is responsible
for making sure that the effects of running the function sequentially as
part of non-vectorized loop are the same as running the simd clone of that
as part of a vectorized loop with the given simdlen vectorization factor.
So it is certainly meant to be used by the vectorizer, after all, that is
its sole purpose.
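
To make that restriction concrete, here is a minimal sketch (not taken from
the patch; scale_store, run_scalar, run_chunked and same_effects are made-up
names).  The function writes memory, so it cannot be 'const', yet each call
touches only its own element, so running four iterations as one chunk should
give the same result as running them one by one.  The chunked loop only
emulates what a simdlen(4) clone would do; it is not the clone itself.

```c
#include <assert.h>
#include <string.h>

#define N 16

/* Hypothetical example: writes memory, so it is not 'const', but each call
   touches only out[i], so concurrent lanes cannot affect each other.  */
#pragma omp declare simd simdlen(4) notinbranch uniform(out, k)
int scale_store (int *out, int i, int k)
{
  out[i] = i * k;
  return out[i] + 1;
}

/* Sequential reference: one call per iteration.  */
void run_scalar (int *out, int *ret, int k)
{
  for (int i = 0; i < N; i++)
    ret[i] = scale_store (out, i, k);
}

/* Emulates a simdlen(4) chunk: four lanes per step, visited in reverse
   order to show that the order within a chunk must not matter.  */
void run_chunked (int *out, int *ret, int k)
{
  for (int i = 0; i < N; i += 4)
    for (int lane = 3; lane >= 0; lane--)
      ret[i + lane] = scale_store (out, i + lane, k);
}

/* Returns 1 if both execution orders leave identical memory and results.  */
int same_effects (int k)
{
  int out_a[N], out_b[N], ret_a[N], ret_b[N];
  run_scalar (out_a, ret_a, k);
  run_chunked (out_b, ret_b, k);
  return memcmp (out_a, out_b, sizeof out_a) == 0
	 && memcmp (ret_a, ret_b, sizeof ret_a) == 0;
}

/* Convenience check for the property described above.  */
void self_check (void)
{
  assert (same_effects (3));
}
```

A function that, say, accumulated into a shared counter and returned the
running total would fail this kind of check, which is the case the quoted
restriction rules out.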

	Jakub


* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-01  3:05 RFC: simd enabled functions (omp declare simd / elementals) Aldy Hernandez
                   ` (2 preceding siblings ...)
  2013-11-04 10:37 ` Richard Biener
@ 2013-11-07 16:43 ` Martin Jambor
  2013-11-08 18:21   ` Aldy Hernandez
  3 siblings, 1 reply; 11+ messages in thread
From: Martin Jambor @ 2013-11-07 16:43 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Jakub Jelinek, Richard Henderson, Richard Biener, Jan Hubicka,
	gcc-patches, Iyer, Balaji V

Hi,

On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote:
> Hello gentlemen.  I'm CCing all of you, because each of you can
> provide valuable feedback to various parts of the compiler which I
> touch.  I have sprinkled love notes with your names throughout the
> post :).

sorry it took me so long; for various reasons out of my control I
accumulated quite a backlog of email and tasks last week and it took
me a lot of time to chew through it all.

...

> [Martin] I have adapted the ipa_parm_adjustment infrastructure to
> allow adding new arguments out of the blue like you mentioned was
> missing in ipa-prop.h.  I have also added support for creating
> vectors of arguments.  Could you take a look at my changes to
> ipa-prop.[ch]?

Sure, though I have only looked at ipa-* and tree-sra.c stuff.  I do
not have any real objections but would suggest a few amendments.  

I am glad this is becoming a useful infrastructure rather than just a
part of IPA-SRA.  Note that while ipa_combine_adjustments is not used
from anywhere and thus probably buggy anyway, it should in theory be
able to process new_param adjustments too.  Can you please at least
put a "not implemented" assert there?  (The reason is that the plan
still is to replace args_to_skip bitmaps in cgraphclones.c by
adjustments one day and we do need to combine clones.)

> 
> [Martin] I need to add new arguments in the case of inbranch clones,
> which add an additional vector with a mask as the last argument:
> For the following:
> 
> #pragma omp declare simd simdlen(4) inbranch
> int foo (int a)
> {
>   return a + 1234;
> }
> 
> ...we would generate a clone with:
> 
> vector(4) int
> foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5)
> 
> I thought it best to enhance ipa_modify_formal_parameters() and
> associated machinery than to add the new argument ad-hoc.  We
> already have enough ways of doing tree and cgraph versioning in the
> compiler ;-).
> 

...

> gcc/ChangeLog.elementals
> 
> 	* Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H.
> 	* tree-vect-stmts.c (vectorizable_call): Allow > 3 arguments when
> 	a SIMD clone may be available.
> 	(vectorizable_function): Use SIMD clone if available.
> 	* ipa-cp.c (determine_versionability): Nodes with SIMD clones are
> 	not versionable.
> 	* ggc.h (ggc_alloc_cleared_simd_clone_stat): New.
> 	* cgraph.h (enum linear_stride_type): New.
> 	(struct simd_clone_arg): New.
> 	(struct simd_clone): New.
> 	(struct cgraph_node): Add simdclone and simdclone_of fields.
> 	(get_simd_clone): Protoize.
> 	* cgraph.c (get_simd_clone): New.
> 	Add `has_simd_clones' field.
> 	* ipa-cp.c (determine_versionability): Disallow functions with
> 	simd clones.

(This looks like a repeated entry.)

> 	* ipa-prop.h (ipa_sra_modify_function_body): Protoize.
> 	(sra_ipa_modify_expr): Same.
> 	(struct ipa_parm_adjustment): Add new_arg_prefix and new_param
> 	fields.  Document their use.
> 	* ipa-prop.c (ipa_modify_formal_parameters): Handle creating brand
> 	new parameters and minor cleanups.
> 	* omp-low.c: Add new pass_omp_simd_clone support code.
> 	(make_pass_omp_simd_clone): New.
> 	(pass_data_omp_simd_clone): Declare.
> 	(class pass_omp_simd_clone): Declare.
> 	(vecsize_mangle): New.
> 	(ipa_omp_simd_clone): New.
> 	(simd_clone_clauses_extract): New.
> 	(simd_clone_compute_base_data_type): New.
> 	(simd_clone_compute_vecsize_and_simdlen): New.
> 	(simd_clone_create): New.
> 	(simd_clone_adjust_return_type): New.
> 	(simd_clone_adjust_return_types): New.
> 	(simd_clone_adjust): New.
> 	(simd_clone_init_simd_arrays): New.
> 	(ipa_simd_modify_function_body): New.
> 	(simd_clone_mangle): New.
> 	(simd_clone_struct_alloc): New.
> 	(simd_clone_struct_copy): New.
> 	(class argno_map): New.
> 	(argno_map::argno_map(tree)): New.
> 	(argno_map::~argno_map): New.
> 	(argno_map::operator []): New.
> 	(argno_map::length): New.
> 	(expand_simd_clones): New.
> 	(create_tmp_simd_array): New.
> 	* tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): New.
> 	* tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document.
> 	* tree-pass.h (make_pass_omp_simd_clone): New.
> 	* passes.def (pass_omp_simd_clone): New.
> 	* target.def: Define new hook prefix "TARGET_CILKPLUS_".
> 	(default_vecsize_mangle): New.
> 	(vecsize_for_mangle): New.
> 	* doc/tm.texi.in: Add placeholder for
> 	TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE and
> 	TARGET_CILKPLUS_VECSIZE_FOR_MANGLE.
> 	* tree-sra.c (sra_ipa_modify_expr): Remove static modifier.
> 	(ipa_sra_modify_function_body): Same.
> 	* tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define.
> 	* doc/tm.texi: Regenerate.
> 	* config/i386/i386.c (ix86_cilkplus_default_vecsize_mangle): New.
> 	(ix86_cilkplus_vecsize_for_mangle): New.
> 	(TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE): New.
> 	(TARGET_CILKPLUS_VECSIZE_FOR_MANGLE): New.
> 

...

> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index c38ba82..faae080 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -446,6 +446,13 @@ determine_versionability (struct cgraph_node *node)
>      reason = "not a tree_versionable_function";
>    else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
>      reason = "insufficient body availability";
> +  else if (node->has_simd_clones)
> +    {
> +      /* Ideally we should clone the SIMD clones themselves and create
> +	 vector copies of them, so IPA-cp and SIMD clones can happily
> +	 coexist, but that may not be worth the effort.  */
> +      reason = "function has SIMD clones";
> +    }

Let's hope we will eventually fix this in some followup :-)


>  
>    if (reason && dump_file && !node->symbol.alias && !node->thunk.thunk_p)
>      fprintf (dump_file, "Function %s/%i is not versionable, reason: %s.\n",
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 2fbc9d4..0c20dc6 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -3361,24 +3361,18 @@ void
>  ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  			      const char *synth_parm_prefix)
>  {
> -  vec<tree> oparms, otypes;
> -  tree orig_type, new_type = NULL;
> -  tree old_arg_types, t, new_arg_types = NULL;
> -  tree parm, *link = &DECL_ARGUMENTS (fndecl);
> -  int i, len = adjustments.length ();
> -  tree new_reversed = NULL;
> -  bool care_for_types, last_parm_void;
> -
>    if (!synth_parm_prefix)
>      synth_parm_prefix = "SYNTH";
>  
> -  oparms = ipa_get_vector_of_formal_parms (fndecl);
> -  orig_type = TREE_TYPE (fndecl);
> -  old_arg_types = TYPE_ARG_TYPES (orig_type);
> +  vec<tree> oparms = ipa_get_vector_of_formal_parms (fndecl);
> +  tree orig_type = TREE_TYPE (fndecl);
> +  tree old_arg_types = TYPE_ARG_TYPES (orig_type);
>  
>    /* The following test is an ugly hack, some functions simply don't have any
>       arguments in their type.  This is probably a bug but well... */
> -  care_for_types = (old_arg_types != NULL_TREE);
> +  bool care_for_types = (old_arg_types != NULL_TREE);
> +  bool last_parm_void;
> +  vec<tree> otypes;
>    if (care_for_types)
>      {
>        last_parm_void = (TREE_VALUE (tree_last (old_arg_types))
> @@ -3395,13 +3389,20 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>        otypes.create (0);
>      }
>  
> -  for (i = 0; i < len; i++)
> +  int len = adjustments.length ();
> +  tree *link = &DECL_ARGUMENTS (fndecl);
> +  tree new_arg_types = NULL;
> +  for (int i = 0; i < len; i++)
>      {
>        struct ipa_parm_adjustment *adj;
>        gcc_assert (link);
>  
>        adj = &adjustments[i];
> -      parm = oparms[adj->base_index];
> +      tree parm;
> +      if (adj->new_param)

I don't know what I was thinking when I invented copy_param and
remove_param as multiple flags rather than a single enum, I probably
wasn't thinking at all.  I can change it myself as a followup if you
have more pressing tasks now.  Meanwhile, can you gcc_checking_assert
that at most one flag is set at appropriate places?
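
As an illustration of the interim check being asked for, the sketch below
models only the three one-bit flags and tests that at most one of them is
set.  The struct and the helper names (adj_flags, at_most_one_op_p,
check_adjustment) are hypothetical stand-ins, and plain assert is used where
GCC itself would presumably use gcc_checking_assert.

```c
#include <assert.h>

/* Hypothetical stand-in for the three one-bit flags in
   struct ipa_parm_adjustment; nothing else is modelled here.  */
struct adj_flags
{
  unsigned new_param : 1;
  unsigned copy_param : 1;
  unsigned remove_param : 1;
};

/* Returns nonzero when at most one of the mutually exclusive flags is set.  */
int
at_most_one_op_p (const struct adj_flags *adj)
{
  return adj->new_param + adj->copy_param + adj->remove_param <= 1;
}

/* Sketch of a call site for the check.  */
void
check_adjustment (const struct adj_flags *adj)
{
  assert (at_most_one_op_p (adj));
}
```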

> +	parm = NULL;
> +      else
> +	parm = oparms[adj->base_index];
>        adj->base = parm;

I do not think it makes sense for new parameters to have a base which
is basically the old decl.  Do you have any reasons for not setting it
to NULL?

>  
>        if (adj->copy_param)
> @@ -3417,8 +3418,18 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  	  tree new_parm;
>  	  tree ptype;
>  

> -	  if (adj->by_ref)
> -	    ptype = build_pointer_type (adj->type);

Please add gcc_checking_assert (!adj->by_ref || adj->simdlen == 0)
here...

> +	  if (adj->simdlen)
> +	    {
> +	      /* If we have a non-null simdlen but by_ref is true, we
> +		 want a vector of pointers.  Build the vector of
> +		 pointers here, not a pointer to a vector in the
> +		 adj->by_ref case below.  */
> +	      ptype = build_vector_type (adj->type, adj->simdlen);
> +	    }
> +	  else if (adj->by_ref)

...or remove this else and be able to build a pointer to the vector
if by_ref is true.

> +	    {
> +	      ptype = build_pointer_type (adj->type);
> +	    }
>  	  else
>  	    ptype = adj->type;
>  
> @@ -3427,8 +3438,9 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  
>  	  new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL_TREE,
>  				 ptype);
> -	  DECL_NAME (new_parm) = create_tmp_var_name (synth_parm_prefix);
> -
> +	  const char *prefix
> +	    = adj->new_param ? adj->new_arg_prefix : synth_parm_prefix;

Can we perhaps get rid of synth_parm_prefix then and just have
adj->new_arg_prefix?  It's not particularly important but this is
weird.


> +	  DECL_NAME (new_parm) = create_tmp_var_name (prefix);
>  	  DECL_ARTIFICIAL (new_parm) = 1;
>  	  DECL_ARG_TYPE (new_parm) = ptype;
>  	  DECL_CONTEXT (new_parm) = fndecl;
> @@ -3436,17 +3448,20 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  	  DECL_IGNORED_P (new_parm) = 1;
>  	  layout_decl (new_parm, 0);
>  
> -	  adj->base = parm;
> +	  if (adj->new_param)
> +	    adj->base = new_parm;

Again, shouldn't this be NULL?

> +	  else
> +	    adj->base = parm;
>  	  adj->reduction = new_parm;
>  
>  	  *link = new_parm;
> -
>  	  link = &DECL_CHAIN (new_parm);
>  	}
>      }
>  
>    *link = NULL_TREE;
>  
> +  tree new_reversed = NULL;
>    if (care_for_types)
>      {
>        new_reversed = nreverse (new_arg_types);
> @@ -3464,6 +3479,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>       Exception is METHOD_TYPEs must have THIS argument.
>       When we are asked to remove it, we need to build new FUNCTION_TYPE
>       instead.  */
> +  tree new_type = NULL;
>    if (TREE_CODE (orig_type) != METHOD_TYPE
>         || (adjustments[0].copy_param
>  	  && adjustments[0].base_index == 0))
> @@ -3489,7 +3505,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  
>    /* This is a new type, not a copy of an old type.  Need to reassociate
>       variants.  We can handle everything except the main variant lazily.  */
> -  t = TYPE_MAIN_VARIANT (orig_type);
> +  tree t = TYPE_MAIN_VARIANT (orig_type);
>    if (orig_type != t)
>      {
>        TYPE_MAIN_VARIANT (new_type) = t;
> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> index 48634d2..8d7d9b9 100644
> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -634,9 +634,10 @@ struct ipa_parm_adjustment
>       arguments.  */
>    tree alias_ptr_type;
>  
> -  /* The new declaration when creating/replacing a parameter.  Created by
> -     ipa_modify_formal_parameters, useful for functions modifying the body
> -     accordingly. */
> +  /* The new declaration when creating/replacing a parameter.  Created
> +     by ipa_modify_formal_parameters, useful for functions modifying
> +     the body accordingly.  For brand new arguments, this is the newly
> +     created argument.  */
>    tree reduction;

We should eventually rename this to new_decl or something, given that
this is not an SRA thing any more.  But that can be done later.

>  
>    /* New declaration of a substitute variable that we may use to replace all
> @@ -647,15 +648,36 @@ struct ipa_parm_adjustment
>       is NULL), this is going to be its nonlocalized vars value.  */
>    tree nonlocal_value;
>  
> +  /* If this is a brand new argument, this holds the prefix to be used
> +     for the DECL_NAME.  */
> +  const char *new_arg_prefix;
> +
>    /* Offset into the original parameter (for the cases when the new parameter
>       is a component of an original one).  */
>    HOST_WIDE_INT offset;
>  
> -  /* Zero based index of the original parameter this one is based on.  (ATM
> -     there is no way to insert a new parameter out of the blue because there is
> -     no need but if it arises the code can be easily exteded to do so.)  */
> +  /* Zero based index of the original parameter this one is based on.  */
>    int base_index;
>  
> +  /* If non-null, the parameter is a vector of `type' with this many
> +     elements.  */
> +  int simdlen;
> +
> +  /* This is a brand new parameter.
> +
> +     For new parameters, base_index must be >= the number of
> +     DECL_ARGUMENTS in the function.  That is, new arguments will be
> +     the last arguments in the adjusted function.
> +
> +     ?? Perhaps we could redesign ipa_modify_formal_parameters() to
> +     reorganize argument position, thus allowing inserting of brand
> +     new arguments anywhere, but there is no use for this now.

Where does this requirement come from?  At least at the moment I
cannot see why ipa_modify_formal_parameters wouldn't be able to
reorder parameters as it is?  What breaks if base_index of adjustments
for new parameters has zero or a nonsensical value?

> +
> +     Also, `type' should be set to the new type, `new_arg_prefix'
> +     should be set to the string prefix for the new DECL_NAME, and
> +     `reduction' will ultimately hold the newly created argument.  */
> +  unsigned new_param : 1;
> +
>    /* This new parameter is an unmodified parameter at index base_index. */
>    unsigned copy_param : 1;
>  
> @@ -697,5 +719,7 @@ void ipa_dump_param (FILE *, struct ipa_node_params *info, int i);
>  /* From tree-sra.c:  */
>  tree build_ref_for_offset (location_t, tree, HOST_WIDE_INT, tree,
>  			   gimple_stmt_iterator *, bool);
> +bool ipa_sra_modify_function_body (ipa_parm_adjustment_vec);
> +bool sra_ipa_modify_expr (tree *, bool, ipa_parm_adjustment_vec);
>  

Hm, if you can directly use these, I really think you should rename
them somehow so that their names do not contain SRA and move them to
ipa-prop.c.

Thanks for reviving this slightly moribund infrastructure and sorry
again for the delay,

Martin


* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-07 16:43 ` Martin Jambor
@ 2013-11-08 18:21   ` Aldy Hernandez
  2013-11-11 19:03     ` Martin Jambor
  0 siblings, 1 reply; 11+ messages in thread
From: Aldy Hernandez @ 2013-11-08 18:21 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Henderson, Richard Biener, Jan Hubicka,
	gcc-patches, Iyer, Balaji V

[-- Attachment #1: Type: text/plain, Size: 7052 bytes --]

On 11/07/13 09:09, Martin Jambor wrote:

> I am glad this is becoming a useful infrastructure rather than just a
> part of IPA-SRA.  Note that while ipa_combine_adjustments is not used
> from anywhere and thus probably buggy anyway, it should in theory be
> able to process new_param adjustments too.  Can you please at least
> put a "not implemented" assert there?  (The reason is that the plan

Done.

>> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
>> index c38ba82..faae080 100644
>> --- a/gcc/ipa-cp.c
>> +++ b/gcc/ipa-cp.c
>> @@ -446,6 +446,13 @@ determine_versionability (struct cgraph_node *node)
>>       reason = "not a tree_versionable_function";
>>     else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
>>       reason = "insufficient body availability";
>> +  else if (node->has_simd_clones)
>> +    {
>> +      /* Ideally we should clone the SIMD clones themselves and create
>> +	 vector copies of them, so IPA-cp and SIMD clones can happily
>> +	 coexist, but that may not be worth the effort.  */
>> +      reason = "function has SIMD clones";
>> +    }
>
> Lets hope we will eventually fix this in some followup :-)

Sure, but to be honest it's not super high on my priority list, perhaps 
once the basic functionality is in trunk.

>> +  tree new_arg_types = NULL;
>> +  for (int i = 0; i < len; i++)
>>       {
>>         struct ipa_parm_adjustment *adj;
>>         gcc_assert (link);
>>
>>         adj = &adjustments[i];
>> -      parm = oparms[adj->base_index];
>> +      tree parm;
>> +      if (adj->new_param)
>
> I don't know what I was thinking when I invented copy_param and
> remove_param as multiple flags rather than a single enum, I probably
> wasn't thinking at all.  I can change it myself as a followup if you
> have more pressing tasks now.  Meanwhile, can you gcc_checking_assert
> that at most one flag is set at appropriate places?

Not a problem, I can implement the enum changes since I'm already 
changing all this code.  Done.
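
For readers following along, a rough sketch of what replacing the flags with
a single operator can look like.  The enumerator names IPA_PARM_OP_COPY,
IPA_PARM_OP_REMOVE and IPA_PARM_OP_NEW are the ones used in the revised
patch; IPA_PARM_OP_NONE, the enumerator values, the trimmed-down struct and
the pick_base helper are assumptions made only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Enumerator names other than IPA_PARM_OP_NONE appear in the revised patch;
   their values and the NONE member are assumptions of this sketch.  */
enum ipa_parm_op
{
  IPA_PARM_OP_NONE,
  IPA_PARM_OP_COPY,
  IPA_PARM_OP_REMOVE,
  IPA_PARM_OP_NEW
};

/* Hypothetical, trimmed-down adjustment record.  */
struct sketch_adjustment
{
  enum ipa_parm_op op;
  int base_index;
};

/* Mirrors the base selection discussed in the thread: a brand new parameter
   has no original declaration, so its base is NULL; otherwise the base is
   the old parameter at base_index.  Strings stand in for trees.  */
const char *
pick_base (const struct sketch_adjustment *adj, const char *const *oparms)
{
  if (adj->op == IPA_PARM_OP_NEW)
    return NULL;
  return oparms[adj->base_index];
}
```

With a single field, "at most one operation" holds by construction, so the
interim assert on the bit flags is no longer needed.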

>
>> +	parm = NULL;
>> +      else
>> +	parm = oparms[adj->base_index];
>>         adj->base = parm;
>
> I do not think it makes sense for new parameters to have a base which
> is basically the old decl.  Do you have any reasons for not setting it
> to NULL?

In this particular case, adj->base is already being set to NULL because 
parm is NULL when adj->op is IPA_PARM_OP_NEW.  The code now reads:


>       if (adj->op == IPA_PARM_OP_NEW)
> 	parm = NULL;
>       else
> 	parm = oparms[adj->base_index];
>       adj->base = parm;

Am I missing something?  Base is already being set to NULL for new 
parameters.

>
>>
>>         if (adj->copy_param)
>> @@ -3417,8 +3418,18 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>>   	  tree new_parm;
>>   	  tree ptype;
>>
>
>> -	  if (adj->by_ref)
>> -	    ptype = build_pointer_type (adj->type);
>
> Please add gcc_checking_assert (!adj->by_ref || adj->simdlen == 0)
> here...

Done.

>> +	  const char *prefix
>> +	    = adj->new_param ? adj->new_arg_prefix : synth_parm_prefix;
>
> Can we perhaps get rid of synth_parm_prefix then and just have
> adj->new_arg_prefix?  It's not particularly important but this is
> weird.

Done.

>
>
>> +	  DECL_NAME (new_parm) = create_tmp_var_name (prefix);
>>   	  DECL_ARTIFICIAL (new_parm) = 1;
>>   	  DECL_ARG_TYPE (new_parm) = ptype;
>>   	  DECL_CONTEXT (new_parm) = fndecl;
>> @@ -3436,17 +3448,20 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>>   	  DECL_IGNORED_P (new_parm) = 1;
>>   	  layout_decl (new_parm, 0);
>>
>> -	  adj->base = parm;
>> +	  if (adj->new_param)
>> +	    adj->base = new_parm;
>
> Again, shouldn't this be NULL?

This one, yes :).  Done.

>> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
>> index 48634d2..8d7d9b9 100644
>> --- a/gcc/ipa-prop.h
>> +++ b/gcc/ipa-prop.h
>> @@ -634,9 +634,10 @@ struct ipa_parm_adjustment
>>        arguments.  */
>>     tree alias_ptr_type;
>>
>> -  /* The new declaration when creating/replacing a parameter.  Created by
>> -     ipa_modify_formal_parameters, useful for functions modifying the body
>> -     accordingly. */
>> +  /* The new declaration when creating/replacing a parameter.  Created
>> +     by ipa_modify_formal_parameters, useful for functions modifying
>> +     the body accordingly.  For brand new arguments, this is the newly
>> +     created argument.  */
>>     tree reduction;
>
> We should eventually rename this to new_decl or something, given that
> this is not an SRA thing any more.  But that can be done later.

Done.

>
>>
>>     /* New declaration of a substitute variable that we may use to replace all
>> @@ -647,15 +648,36 @@ struct ipa_parm_adjustment
>>        is NULL), this is going to be its nonlocalized vars value.  */
>>     tree nonlocal_value;
>>
>> +  /* If this is a brand new argument, this holds the prefix to be used
>> +     for the DECL_NAME.  */
>> +  const char *new_arg_prefix;
>> +
>>     /* Offset into the original parameter (for the cases when the new parameter
>>        is a component of an original one).  */
>>     HOST_WIDE_INT offset;
>>
>> -  /* Zero based index of the original parameter this one is based on.  (ATM
>> -     there is no way to insert a new parameter out of the blue because there is
>> -     no need but if it arises the code can be easily exteded to do so.)  */
>> +  /* Zero based index of the original parameter this one is based on.  */
>>     int base_index;
>>
>> +  /* If non-null, the parameter is a vector of `type' with this many
>> +     elements.  */
>> +  int simdlen;
>> +
>> +  /* This is a brand new parameter.
>> +
>> +     For new parameters, base_index must be >= the number of
>> +     DECL_ARGUMENTS in the function.  That is, new arguments will be
>> +     the last arguments in the adjusted function.
>> +
>> +     ?? Perhaps we could redesign ipa_modify_formal_parameters() to
>> +     reorganize argument position, thus allowing inserting of brand
>> +     new arguments anywhere, but there is no use for this now.
>
> Where does this requirement come from?  At least at the moment I
> cannot see why ipa_modify_formal_parameters wouldn't be able to
> reorder parameters as it is?  What breaks if base_index of adjustments
> for new parameters has zero or a nonsensical value?

From my very vivid imagination.  Forget I said that.  I hadn't looked 
into it at all; I just assumed.  I have removed the ??? comment.

> Hm, if you can directly use these, I really think you should rename
> them somehow so that their names do not contain SRA and move them to
> ipa-prop.c.

I'd like to do this as a followup so you can see all my changes before I 
move things en masse.

>
> Thanks for reviving this slightly moribund infrastructure and sorry
> again for the delay,

Not a problem.  Thanks for the review.

Would you be so kind as to review these changes to make sure I didn't 
miss anything?

The patch is lightly tested as my current box is pathetically slow today 
but so far so good with gomp.exp tests.

OK for gomp-4_0-branch pending tests?

Aldy


[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 17958 bytes --]

commit c4daa339084cb2d67b49fa2c33245ea09057752e
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Fri Nov 8 09:29:49 2013 -0700

    	* ipa-prop.c (ipa_modify_formal_parameters): Remove
    	synth_parm_prefix argument.
    	Use operator enum instead of bit fields.
    	Add assert for properly handling vector of references.
    	(ipa_modify_call_arguments): Use operator enum instead of bit
    	fields.
    	(ipa_combine_adjustments): Same.
    	Assert that IPA_PARM_OP_NEW is not used.
    	(ipa_dump_param_adjustments): Rename reduction to new_decl.
    	Use operator enum instead of bit fields.
    	* ipa-prop.h (enum ipa_parm_op): New.
    	(struct ipa_parm_adjustment): New field op.
    	Rename reduction to new_decl.
    	Rename new_arg_prefix to arg_prefix.
    	Remove new_param, remove_param, copy_param.
    	(ipa_modify_formal_parameters): Remove argument.
    	* omp-low.c (simd_clone_adjust_argument_types):	Set arg_prefix.
    	Use operator enum instead of bit fields.
    	(simd_clone_adjust_argument_types): Use operator enum instead of
    	bit fields.
    	Remove last argument to ipa_modify_formal_parameters call.
    	(simd_clone_init_simd_arrays): Use operator enum.
    	(ipa_simd_modify_stmt_ops): Rename reduction to new_decl.
    	(ipa_simd_modify_function_body): Same.
    	* tree-sra.c (turn_representatives_into_adjustments): Use operator
    	enum.  Set arg_prefix.
    	(get_adjustment_for_base): Use operator enum.
    	(sra_ipa_get_adjustment_candidate): Same.
    	(sra_ipa_modify_expr): Rename reduction to new_decl.
    	(sra_ipa_reset_debug_stmts): Use operator enum.
    	(modify_function): Do not pass prefix argument.

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 94a47cb..2a1f1e8 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -3362,12 +3362,8 @@ get_vector_of_formal_parm_types (tree fntype)
    base_index field.  */
 
 void
-ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
-			      const char *synth_parm_prefix)
+ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments)
 {
-  if (!synth_parm_prefix)
-    synth_parm_prefix = "SYNTH";
-
   vec<tree> oparms = ipa_get_vector_of_formal_parms (fndecl);
   tree orig_type = TREE_TYPE (fndecl);
   tree old_arg_types = TYPE_ARG_TYPES (orig_type);
@@ -3403,13 +3399,13 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 
       adj = &adjustments[i];
       tree parm;
-      if (adj->new_param)
+      if (adj->op == IPA_PARM_OP_NEW)
 	parm = NULL;
       else
 	parm = oparms[adj->base_index];
       adj->base = parm;
 
-      if (adj->copy_param)
+      if (adj->op == IPA_PARM_OP_COPY)
 	{
 	  if (care_for_types)
 	    new_arg_types = tree_cons (NULL_TREE, otypes[adj->base_index],
@@ -3417,11 +3413,12 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 	  *link = parm;
 	  link = &DECL_CHAIN (parm);
 	}
-      else if (!adj->remove_param)
+      else if (adj->op != IPA_PARM_OP_REMOVE)
 	{
 	  tree new_parm;
 	  tree ptype;
 
+	  gcc_checking_assert (!adj->by_ref || adj->simdlen);
 	  if (adj->simdlen)
 	    {
 	      /* If we have a non-null simdlen but by_ref is true, we
@@ -3442,8 +3439,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 
 	  new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL_TREE,
 				 ptype);
-	  const char *prefix
-	    = adj->new_param ? adj->new_arg_prefix : synth_parm_prefix;
+	  const char *prefix = adj->arg_prefix ? adj->arg_prefix : "SYNTH";
 	  DECL_NAME (new_parm) = create_tmp_var_name (prefix);
 	  DECL_ARTIFICIAL (new_parm) = 1;
 	  DECL_ARG_TYPE (new_parm) = ptype;
@@ -3452,11 +3448,11 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
 	  DECL_IGNORED_P (new_parm) = 1;
 	  layout_decl (new_parm, 0);
 
-	  if (adj->new_param)
-	    adj->base = new_parm;
+	  if (adj->op == IPA_PARM_OP_NEW)
+	    adj->base = NULL;
 	  else
 	    adj->base = parm;
-	  adj->reduction = new_parm;
+	  adj->new_decl = new_parm;
 
 	  *link = new_parm;
 	  link = &DECL_CHAIN (new_parm);
@@ -3485,7 +3481,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
      instead.  */
   tree new_type = NULL;
   if (TREE_CODE (orig_type) != METHOD_TYPE
-       || (adjustments[0].copy_param
+       || (adjustments[0].op == IPA_PARM_OP_COPY
 	  && adjustments[0].base_index == 0))
     {
       new_type = build_distinct_type_copy (orig_type);
@@ -3558,13 +3554,13 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt,
 
       adj = &adjustments[i];
 
-      if (adj->copy_param)
+      if (adj->op == IPA_PARM_OP_COPY)
 	{
 	  tree arg = gimple_call_arg (stmt, adj->base_index);
 
 	  vargs.quick_push (arg);
 	}
-      else if (!adj->remove_param)
+      else if (adj->op != IPA_PARM_OP_REMOVE)
 	{
 	  tree expr, base, off;
 	  location_t loc;
@@ -3683,7 +3679,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt,
 					   NULL, true, GSI_SAME_STMT);
 	  vargs.quick_push (expr);
 	}
-      if (!adj->copy_param && MAY_HAVE_DEBUG_STMTS)
+      if (adj->op != IPA_PARM_OP_COPY && MAY_HAVE_DEBUG_STMTS)
 	{
 	  unsigned int ix;
 	  tree ddecl = NULL_TREE, origin = DECL_ORIGIN (adj->base), arg;
@@ -3803,10 +3799,14 @@ ipa_combine_adjustments (ipa_parm_adjustment_vec inner,
       struct ipa_parm_adjustment *n;
       n = &inner[i];
 
-      if (n->remove_param)
+      if (n->op == IPA_PARM_OP_REMOVE)
 	removals++;
       else
-	tmp.quick_push (*n);
+	{
+	  /* FIXME: Handling of new arguments are not implemented yet.  */
+	  gcc_assert (n->op != IPA_PARM_OP_NEW);
+	  tmp.quick_push (*n);
+	}
     }
 
   adjustments.create (outlen + removals);
@@ -3817,27 +3817,32 @@ ipa_combine_adjustments (ipa_parm_adjustment_vec inner,
       struct ipa_parm_adjustment *in = &tmp[out->base_index];
 
       memset (&r, 0, sizeof (r));
-      gcc_assert (!in->remove_param);
-      if (out->remove_param)
+      gcc_assert (in->op != IPA_PARM_OP_REMOVE);
+      if (out->op == IPA_PARM_OP_REMOVE)
 	{
 	  if (!index_in_adjustments_multiple_times_p (in->base_index, tmp))
 	    {
-	      r.remove_param = true;
+	      r.op = IPA_PARM_OP_REMOVE;
 	      adjustments.quick_push (r);
 	    }
 	  continue;
 	}
+      else
+	{
+	  /* FIXME: Handling of new arguments is not implemented yet.  */
+	  gcc_assert (out->op != IPA_PARM_OP_NEW);
+	}
 
       r.base_index = in->base_index;
       r.type = out->type;
 
       /* FIXME:  Create nonlocal value too.  */
 
-      if (in->copy_param && out->copy_param)
-	r.copy_param = true;
-      else if (in->copy_param)
+      if (in->op == IPA_PARM_OP_COPY && out->op == IPA_PARM_OP_COPY)
+	r.op = IPA_PARM_OP_COPY;
+      else if (in->op == IPA_PARM_OP_COPY)
 	r.offset = out->offset;
-      else if (out->copy_param)
+      else if (out->op == IPA_PARM_OP_COPY)
 	r.offset = in->offset;
       else
 	r.offset = in->offset + out->offset;
@@ -3848,7 +3853,7 @@ ipa_combine_adjustments (ipa_parm_adjustment_vec inner,
     {
       struct ipa_parm_adjustment *n = &inner[i];
 
-      if (n->remove_param)
+      if (n->op == IPA_PARM_OP_REMOVE)
 	adjustments.quick_push (*n);
     }
 
@@ -3885,10 +3890,10 @@ ipa_dump_param_adjustments (FILE *file, ipa_parm_adjustment_vec adjustments,
 	  fprintf (file, ", base: ");
 	  print_generic_expr (file, adj->base, 0);
 	}
-      if (adj->reduction)
+      if (adj->new_decl)
 	{
-	  fprintf (file, ", reduction: ");
-	  print_generic_expr (file, adj->reduction, 0);
+	  fprintf (file, ", new_decl: ");
+	  print_generic_expr (file, adj->new_decl, 0);
 	}
       if (adj->new_ssa_base)
 	{
@@ -3896,9 +3901,9 @@ ipa_dump_param_adjustments (FILE *file, ipa_parm_adjustment_vec adjustments,
 	  print_generic_expr (file, adj->new_ssa_base, 0);
 	}
 
-      if (adj->copy_param)
+      if (adj->op == IPA_PARM_OP_COPY)
 	fprintf (file, ", copy_param");
-      else if (adj->remove_param)
+      else if (adj->op == IPA_PARM_OP_REMOVE)
 	fprintf (file, ", remove_param");
       else
 	fprintf (file, ", offset %li", (long) adj->offset);
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 6aebf8d..e1e8622 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -609,6 +609,31 @@ extern alloc_pool ipcp_values_pool;
 extern alloc_pool ipcp_sources_pool;
 extern alloc_pool ipcp_agg_lattice_pool;
 
+/* Operation to be performed for the parameter in ipa_parm_adjustment
+   below.  */
+enum ipa_parm_op {
+  IPA_PARM_OP_NONE,
+
+  /* This describes a brand new parameter.
+
+     For new parameters, base_index must be >= the number of
+     DECL_ARGUMENTS in the function.  That is, new arguments will be
+     the last arguments in the adjusted function.
+
+     Also, `type' should be set to the new type, `arg_prefix'
+     should be set to the string prefix for the new DECL_NAME, and
+     `new_decl' will ultimately hold the newly created argument.  */
+  IPA_PARM_OP_NEW,
+
+  /* This new parameter is an unmodified parameter at index base_index. */
+  IPA_PARM_OP_COPY,
+
+  /* This adjustment describes a parameter that is about to be removed
+     completely.  Most users will probably need to book keep those so that they
+     don't leave behind any non default def ssa names belonging to them.  */
+  IPA_PARM_OP_REMOVE
+};
+
 /* Structure to describe transformations of formal parameters and actual
    arguments.  Each instance describes one new parameter and they are meant to
    be stored in a vector.  Additionally, most users will probably want to store
@@ -636,7 +661,7 @@ struct ipa_parm_adjustment
      by ipa_modify_formal_parameters, useful for functions modifying
      the body accordingly.  For brand new arguments, this is the newly
      created argument.  */
-  tree reduction;
+  tree new_decl;
 
   /* New declaration of a substitute variable that we may use to replace all
      non-default-def ssa names when a parm decl is going away.  */
@@ -646,9 +671,8 @@ struct ipa_parm_adjustment
      is NULL), this is going to be its nonlocalized vars value.  */
   tree nonlocal_value;
 
-  /* If this is a brand new argument, this holds the prefix to be used
-     for the DECL_NAME.  */
-  const char *new_arg_prefix;
+  /* This holds the prefix to be used for the new DECL_NAME.  */
+  const char *arg_prefix;
 
   /* Offset into the original parameter (for the cases when the new parameter
      is a component of an original one).  */
@@ -661,28 +685,9 @@ struct ipa_parm_adjustment
      elements.  */
   int simdlen;
 
-  /* This is a brand new parameter.
-
-     For new parameters, base_index must be >= the number of
-     DECL_ARGUMENTS in the function.  That is, new arguments will be
-     the last arguments in the adjusted function.
-
-     ?? Perhaps we could redesign ipa_modify_formal_parameters() to
-     reorganize argument position, thus allowing inserting of brand
-     new arguments anywhere, but there is no use for this now.
-
-     Also, `type' should be set to the new type, `new_arg_prefix'
-     should be set to the string prefix for the new DECL_NAME, and
-     `reduction' will ultimately hold the newly created argument.  */
-  unsigned new_param : 1;
-
-  /* This new parameter is an unmodified parameter at index base_index. */
-  unsigned copy_param : 1;
-
-  /* This adjustment describes a parameter that is about to be removed
-     completely.  Most users will probably need to book keep those so that they
-     don't leave behinfd any non default def ssa names belonging to them.  */
-  unsigned remove_param : 1;
+  /* Whether this parameter is a new parameter, a copy of an old one,
+     or one about to be removed.  */
+  enum ipa_parm_op op;
 
   /* The parameter is to be passed by reference.  */
   unsigned by_ref : 1;
@@ -693,8 +698,7 @@ typedef struct ipa_parm_adjustment ipa_parm_adjustment_t;
 typedef vec<ipa_parm_adjustment_t> ipa_parm_adjustment_vec;
 
 vec<tree> ipa_get_vector_of_formal_parms (tree fndecl);
-void ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec,
-				   const char *);
+void ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec);
 void ipa_modify_call_arguments (struct cgraph_edge *, gimple,
 				ipa_parm_adjustment_vec);
 ipa_parm_adjustment_vec ipa_combine_adjustments (ipa_parm_adjustment_vec,
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 6845ee6..51cda58 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -11635,7 +11635,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
       if (node->simdclone->args[i].arg_type != SIMD_CLONE_ARG_TYPE_VECTOR)
 	{
 	  /* No adjustment necessary for scalar arguments.  */
-	  adj.copy_param = 1;
+	  adj.op = IPA_PARM_OP_COPY;
 	}
       else
 	{
@@ -11649,6 +11649,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
 				     TREE_TYPE (parm),
 				     node->simdclone->simdlen);
 	}
+      adj.arg_prefix = "simd";
       adjustments.quick_push (adj);
     }
 
@@ -11657,8 +11658,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
       struct ipa_parm_adjustment adj;
 
       memset (&adj, 0, sizeof (adj));
-      adj.new_param = 1;
-      adj.new_arg_prefix = "mask";
+      adj.op = IPA_PARM_OP_NEW;
+      adj.arg_prefix = "mask";
       adj.base_index = i;
       adj.type
 	= build_vector_type (integer_type_node, node->simdclone->simdlen);
@@ -11674,7 +11675,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
 	= create_tmp_simd_array ("mask", integer_type_node, sc->simdlen);
     }
 
-  ipa_modify_formal_parameters (node->decl, adjustments, "simd");
+  ipa_modify_formal_parameters (node->decl, adjustments);
   return adjustments;
 }
 
@@ -11693,7 +11694,7 @@ simd_clone_init_simd_arrays (struct cgraph_node *node,
        arg;
        arg = DECL_CHAIN (arg), i++)
     {
-      if (adjustments[i].copy_param)
+      if (adjustments[i].op == IPA_PARM_OP_COPY)
 	continue;
 
       node->simdclone->args[i].vector_arg = arg;
@@ -11749,7 +11750,7 @@ ipa_simd_modify_stmt_ops (tree *tp, int *walk_subtrees, void *data)
   gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
   if (wi->is_lhs)
     {
-      stmt = gimple_build_assign (unshare_expr (cand->reduction), repl);
+      stmt = gimple_build_assign (unshare_expr (cand->new_decl), repl);
       gsi_insert_after (&gsi, stmt, GSI_SAME_STMT);
       SSA_NAME_DEF_STMT (repl) = info->stmt;
     }
@@ -11759,7 +11760,7 @@ ipa_simd_modify_stmt_ops (tree *tp, int *walk_subtrees, void *data)
 	 wi->val_only=true, but we may have `*var' which will get
 	 replaced into `*var_array[iter]' and will likely be something
 	 not gimple.  */
-      stmt = gimple_build_assign (repl, unshare_expr (cand->reduction));
+      stmt = gimple_build_assign (repl, unshare_expr (cand->new_decl));
       gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
     }
 
@@ -11802,7 +11803,7 @@ ipa_simd_modify_function_body (struct cgraph_node *node,
 	continue;
 
       tree basetype = TREE_TYPE (node->simdclone->args[i].orig_arg);
-      adjustments[i].reduction
+      adjustments[i].new_decl
 	= build4 (ARRAY_REF,
 		  basetype,
 		  node->simdclone->args[i].simd_array,
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 36994f7..2f19899 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -4271,9 +4271,10 @@ turn_representatives_into_adjustments (vec<access_p> representatives,
 	  adj.base_index = get_param_index (parm, parms);
 	  adj.base = parm;
 	  if (!repr)
-	    adj.copy_param = 1;
+	    adj.op = IPA_PARM_OP_COPY;
 	  else
-	    adj.remove_param = 1;
+	    adj.op = IPA_PARM_OP_REMOVE;
+	  adj.arg_prefix = "ISRA";
 	  adjustments.quick_push (adj);
 	}
       else
@@ -4293,6 +4294,7 @@ turn_representatives_into_adjustments (vec<access_p> representatives,
 	      adj.by_ref = (POINTER_TYPE_P (TREE_TYPE (repr->base))
 			    && (repr->grp_maybe_modified
 				|| repr->grp_not_necessarilly_dereferenced));
+	      adj.arg_prefix = "ISRA";
 	      adjustments.quick_push (adj);
 	    }
 	}
@@ -4423,7 +4425,7 @@ get_adjustment_for_base (ipa_parm_adjustment_vec adjustments, tree base)
       struct ipa_parm_adjustment *adj;
 
       adj = &adjustments[i];
-      if (!adj->copy_param && adj->base == base)
+      if (adj->op != IPA_PARM_OP_COPY && adj->base == base)
 	return adj;
     }
 
@@ -4534,14 +4536,14 @@ sra_ipa_get_adjustment_candidate (tree *&expr, bool *convert,
       struct ipa_parm_adjustment *adj = &adjustments[i];
 
       if (adj->base == base
-	  && (adj->offset == offset || adj->remove_param))
+	  && (adj->offset == offset || adj->op == IPA_PARM_OP_REMOVE))
 	{
 	  cand = adj;
 	  break;
 	}
     }
 
-  if (!cand || cand->copy_param || cand->remove_param)
+  if (!cand || cand->op == IPA_PARM_OP_COPY || cand->op == IPA_PARM_OP_REMOVE)
     return NULL;
   return cand;
 }
@@ -4564,9 +4566,9 @@ sra_ipa_modify_expr (tree *expr, bool convert,
 
   tree src;
   if (cand->by_ref)
-    src = build_simple_mem_ref (cand->reduction);
+    src = build_simple_mem_ref (cand->new_decl);
   else
-    src = cand->reduction;
+    src = cand->new_decl;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -4760,7 +4762,7 @@ sra_ipa_reset_debug_stmts (ipa_parm_adjustment_vec adjustments)
       use_operand_p use_p;
 
       adj = &adjustments[i];
-      if (adj->copy_param || !is_gimple_reg (adj->base))
+      if (adj->op == IPA_PARM_OP_COPY || !is_gimple_reg (adj->base))
 	continue;
       name = ssa_default_def (cfun, adj->base);
       vexpr = NULL;
@@ -4943,7 +4945,7 @@ modify_function (struct cgraph_node *node, ipa_parm_adjustment_vec adjustments)
   redirect_callers.release ();
 
   push_cfun (DECL_STRUCT_FUNCTION (new_node->decl));
-  ipa_modify_formal_parameters (current_function_decl, adjustments, "ISRA");
+  ipa_modify_formal_parameters (current_function_decl, adjustments);
   cfg_changed = ipa_sra_modify_function_body (adjustments);
   sra_ipa_reset_debug_stmts (adjustments);
   convert_callers (new_node, node->decl, adjustments);
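
For readers following the refactoring, here is a minimal standalone sketch of the idea, not GCC code.  The three one-bit flags (new_param, copy_param, remove_param) become a single enum, so an adjustment can only be in one state at a time.  The struct parm_adjustment_model and the helper describe_op are hypothetical stand-ins invented for illustration; only the enumerator names and the dump strings are taken from the patch.

```c
#include <stdio.h>
#include <stddef.h>

/* Same enumerators as the enum the patch adds to ipa-prop.h.  */
enum ipa_parm_op {
  IPA_PARM_OP_NONE,
  IPA_PARM_OP_NEW,
  IPA_PARM_OP_COPY,
  IPA_PARM_OP_REMOVE
};

/* Hypothetical, trimmed-down model of struct ipa_parm_adjustment.
   The real structure has many more fields (base, type, new_decl, ...).  */
struct parm_adjustment_model {
  int base_index;
  long offset;
  enum ipa_parm_op op;     /* replaces the three mutually exclusive bits */
  const char *arg_prefix;  /* e.g. "simd", "mask" or "ISRA" */
};

/* Hypothetical helper that mirrors the final if/else chain of
   ipa_dump_param_adjustments as rewritten by the patch.  */
static const char *
describe_op (const struct parm_adjustment_model *adj, char *buf, size_t len)
{
  if (adj->op == IPA_PARM_OP_COPY)
    snprintf (buf, len, "copy_param");
  else if (adj->op == IPA_PARM_OP_REMOVE)
    snprintf (buf, len, "remove_param");
  else
    snprintf (buf, len, "offset %li", adj->offset);
  return buf;
}
```

A zero-initialized adjustment (as produced by the memset calls in the patch) has op == IPA_PARM_OP_NONE, which falls through to the offset case above.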


* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-08 18:21   ` Aldy Hernandez
@ 2013-11-11 19:03     ` Martin Jambor
  2013-11-11 19:25       ` Jakub Jelinek
  2013-11-12 17:52       ` Aldy Hernandez
  0 siblings, 2 replies; 11+ messages in thread
From: Martin Jambor @ 2013-11-11 19:03 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Jakub Jelinek, Richard Henderson, Richard Biener, Jan Hubicka,
	gcc-patches, Iyer, Balaji V

Hi,

Thanks for the follow-up.  I like it; the only thing I don't understand is...

On Fri, Nov 08, 2013 at 10:48:43AM -0700, Aldy Hernandez wrote:
> On 11/07/13 09:09, Martin Jambor wrote:
> 

<...>

> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -609,6 +609,31 @@ extern alloc_pool ipcp_values_pool;
>  extern alloc_pool ipcp_sources_pool;
>  extern alloc_pool ipcp_agg_lattice_pool;
>  
> +/* Operation to be performed for the parameter in ipa_parm_adjustment
> +   below.  */
> +enum ipa_parm_op {
> +  IPA_PARM_OP_NONE,
> +
> +  /* This describes a brand new parameter.
> +
> +     For new parameters, base_index must be >= the number of
> +     DECL_ARGUMENTS in the function.  That is, new arguments will be
> +     the last arguments in the adjusted function.

...where this requirement comes from.  I would think that base_index
would be completely ignored for the new parameters, is it not?

Thanks,

Martin


* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-11 19:03     ` Martin Jambor
@ 2013-11-11 19:25       ` Jakub Jelinek
  2013-11-12 17:52       ` Aldy Hernandez
  1 sibling, 0 replies; 11+ messages in thread
From: Jakub Jelinek @ 2013-11-11 19:25 UTC (permalink / raw)
  To: Aldy Hernandez, Richard Henderson, Richard Biener, Jan Hubicka,
	gcc-patches, Iyer, Balaji V

On Mon, Nov 11, 2013 at 06:57:39PM +0100, Martin Jambor wrote:
> > --- a/gcc/ipa-prop.h
> > +++ b/gcc/ipa-prop.h
> > @@ -609,6 +609,31 @@ extern alloc_pool ipcp_values_pool;
> >  extern alloc_pool ipcp_sources_pool;
> >  extern alloc_pool ipcp_agg_lattice_pool;
> >  
> > +/* Operation to be performed for the parameter in ipa_parm_adjustment
> > +   below.  */
> > +enum ipa_parm_op {
> > +  IPA_PARM_OP_NONE,
> > +
> > +  /* This describes a brand new parameter.
> > +
> > +     For new parameters, base_index must be >= the number of
> > +     DECL_ARGUMENTS in the function.  That is, new arguments will be
> > +     the last arguments in the adjusted function.
> 
> ...where this requirement comes from.  I would think that base_index
> would be completely ignored for the new parameters, is it not?

Ouch, I'll actually need to insert new parameters in the middle, because,
say, for SSE2
#pragma omp declare simd simdlen (4) inbranch
int foo (int x, long long y, int z);
is passed as
vector(4) int foo (vector(4) int x, vector(2) long long y,
		   vector(2) long long y2nd, vector(4) int z,
		   vector(4) int mask);
and thus I need to modify the first and second arguments, insert a new
argument after the second, modify the third argument, and add another
argument at the end.

	Jakub
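
The argument layout described above can be sketched as a small standalone C model.  This is an illustration under stated assumptions, not code from the patch: plan_simd_clone_ops is a hypothetical helper, the 16-byte vector width stands for an SSE2 register, and treating the first vector of each scalar parameter as IPA_PARM_OP_NONE (a modified existing parameter) and every extra vector as IPA_PARM_OP_NEW is one possible encoding.

```c
#include <stddef.h>

/* Same enumerators as in the ipa-prop.h patch under discussion.  */
enum ipa_parm_op {
  IPA_PARM_OP_NONE,
  IPA_PARM_OP_NEW,
  IPA_PARM_OP_COPY,
  IPA_PARM_OP_REMOVE
};

/* Hypothetical helper: given the element size in bytes of each scalar
   parameter, fill OPS with one entry per vector argument of the clone.
   A parameter needing more than one vector gets extra NEW entries
   right after its first one; an inbranch clone gets a trailing NEW
   entry for the mask.  Returns the number of entries written, or 0 if
   OPS (capacity MAX_OPS) is too small.  */
static size_t
plan_simd_clone_ops (const size_t *elem_sizes, size_t nparms,
		     size_t simdlen, size_t vector_bytes, int inbranch,
		     enum ipa_parm_op *ops, size_t max_ops)
{
  size_t n = 0;

  for (size_t i = 0; i < nparms; i++)
    {
      /* Number of vectors needed to hold SIMDLEN elements, rounded up.  */
      size_t nvecs = (elem_sizes[i] * simdlen + vector_bytes - 1)
		     / vector_bytes;

      for (size_t v = 0; v < nvecs; v++)
	{
	  if (n == max_ops)
	    return 0;
	  ops[n++] = (v == 0) ? IPA_PARM_OP_NONE : IPA_PARM_OP_NEW;
	}
    }

  if (inbranch)
    {
      if (n == max_ops)
	return 0;
      ops[n++] = IPA_PARM_OP_NEW;	/* the mask argument */
    }

  return n;
}
```

For foo (int x, long long y, int z) with simdlen 4, taking int as 4 bytes and long long as 8 bytes, this yields five entries: x and y modified, a new argument for the second half of y, z modified, and a new mask argument at the end.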


* Re: RFC: simd enabled functions (omp declare simd / elementals)
  2013-11-11 19:03     ` Martin Jambor
  2013-11-11 19:25       ` Jakub Jelinek
@ 2013-11-12 17:52       ` Aldy Hernandez
  1 sibling, 0 replies; 11+ messages in thread
From: Aldy Hernandez @ 2013-11-12 17:52 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Henderson, Richard Biener, Jan Hubicka,
	gcc-patches, Iyer, Balaji V

[-- Attachment #1: Type: text/plain, Size: 799 bytes --]


>> +/* Operation to be performed for the parameter in ipa_parm_adjustment
>> +   below.  */
>> +enum ipa_parm_op {
>> +  IPA_PARM_OP_NONE,
>> +
>> +  /* This describes a brand new parameter.
>> +
>> +     For new parameters, base_index must be >= the number of
>> +     DECL_ARGUMENTS in the function.  That is, new arguments will be
>> +     the last arguments in the adjusted function.
>
> ...where this requirement comes from.  I would think that base_index
> would be completely ignored for the new parameters, is it not?

Well, whadayaknow... base_index is indeed ignored, and a cursory look at 
ipa_modify_formal_parameters() suggests that you may be able to insert 
arguments out of order (untested).  Jakub, you may be in luck :).

Committing the attached fix to the branch.

Thanks again.

[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 893 bytes --]

commit cc9c895aebe4ba1c017720fe5a43599b53696236
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Tue Nov 12 09:42:30 2013 -0700

    	* ipa-prop.h (enum ipa_parm_op): Adjust comment to IPA_PARM_OP_NEW
    	entry.

diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 0621a13..a2d8797 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -616,11 +616,7 @@ enum ipa_parm_op {
 
   /* This describes a brand new parameter.
 
-     For new parameters, base_index must be >= the number of
-     DECL_ARGUMENTS in the function.  That is, new arguments will be
-     the last arguments in the adjusted function.
-
-     Also, `type' should be set to the new type, `arg_prefix'
+     The field `type' should be set to the new type, `arg_prefix'
      should be set to the string prefix for the new DECL_NAME, and
      `new_decl' will ultimately hold the newly created argument.  */
   IPA_PARM_OP_NEW,


Thread overview: 11+ messages
2013-11-01  3:05 RFC: simd enabled functions (omp declare simd / elementals) Aldy Hernandez
2013-11-01 10:57 ` Jakub Jelinek
2013-11-01 12:35 ` Jakub Jelinek
2013-11-01 16:51   ` Jakub Jelinek
2013-11-04 10:37 ` Richard Biener
2013-11-04 10:58   ` Jakub Jelinek
2013-11-07 16:43 ` Martin Jambor
2013-11-08 18:21   ` Aldy Hernandez
2013-11-11 19:03     ` Martin Jambor
2013-11-11 19:25       ` Jakub Jelinek
2013-11-12 17:52       ` Aldy Hernandez
