public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
@ 2014-04-18 15:00 Jakub Jelinek
  2014-04-22  8:15 ` Richard Biener
  2014-05-13  8:48 ` Rainer Orth
  0 siblings, 2 replies; 8+ messages in thread
From: Jakub Jelinek @ 2014-04-18 15:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi!

This patch fixes the adjustments performed by ipa_simd_modify_function_body
on omp declare simd clones.
Previously we've been trying to replace typically SSA_NAMEs with underlying
PARM_DECLs of the to be replaced arguments with loads/stores from/to
array refs (that will be hopefully vectorized) right around the referencing
stmt, but:
1) this can't really work well if there is any life range overlap in SSA_NAMEs
   with the same underlying PARM_DECL
2) PHIs weren't handled at all (neither PHI arguments, nor lhs of the PHIs)
3) for addressable PARM_DECLs the code pretty much assumed the same thing
   can be done too

This patch instead adjusts all SSA_NAMEs with SSA_NAME_VAR equal to the to
be replaced PARM_DECLs to a new underlying VAR_DECL, only changes the
(D) SSA_NAME to a load done at the start of the entry block, and for
addressable PARM_DECLs adjusts them such that they don't have to be
regimplified (as we replace say address of a PARM_DECL which is a
gimple_min_invariant with array ref with variable index which is not
gimple_min_invariant, we need to force the addresses into SSA_NAMEs).

The tree-dfa.c fix is what I've discovered while writing the patch,
if htab_find_slot_with_hash (..., NO_INSERT) fails to find something
in the hash table (most likely not actually needed by the patch, discovered
that just because the patch was buggy initially), it returns NULL rather
than address of some slot which will contain NULL.

Bootstrapped/regtested on x86_64-linux and i686-linux.

Richard, does this look reasonable?

2014-04-18  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/60823
	* omp-low.c (ipa_simd_modify_function_body): Go through
	all SSA_NAMEs and for those refering to vector arguments
	which are going to be replaced adjust SSA_NAME_VAR and,
	if it is a default definition, change it into a non-default
	definition assigned at the beginning of function from new_decl.
	(ipa_simd_modify_stmt_ops): Rewritten.
	* tree-dfa.c (set_ssa_default_def): When removing default def,
	check for NULL loc instead of NULL *loc.

	* c-c++-common/gomp/pr60823-1.c: New test.
	* c-c++-common/gomp/pr60823-2.c: New test.
	* c-c++-common/gomp/pr60823-3.c: New test.

--- gcc/omp-low.c.jj	2014-04-17 14:48:59.076025713 +0200
+++ gcc/omp-low.c	2014-04-18 12:00:16.666701773 +0200
@@ -11281,45 +11281,53 @@ static tree
 ipa_simd_modify_stmt_ops (tree *tp, int *walk_subtrees, void *data)
 {
   struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
-  if (!SSA_VAR_P (*tp))
+  struct modify_stmt_info *info = (struct modify_stmt_info *) wi->info;
+  tree *orig_tp = tp;
+  if (TREE_CODE (*tp) == ADDR_EXPR)
+    tp = &TREE_OPERAND (*tp, 0);
+  struct ipa_parm_adjustment *cand = NULL;
+  if (TREE_CODE (*tp) == PARM_DECL)
+    cand = ipa_get_adjustment_candidate (&tp, NULL, info->adjustments, true);
+  else
     {
-      /* Make sure we treat subtrees as a RHS.  This makes sure that
-	 when examining the `*foo' in *foo=x, the `foo' get treated as
-	 a use properly.  */
-      wi->is_lhs = false;
-      wi->val_only = true;
       if (TYPE_P (*tp))
 	*walk_subtrees = 0;
-      return NULL_TREE;
-    }
-  struct modify_stmt_info *info = (struct modify_stmt_info *) wi->info;
-  struct ipa_parm_adjustment *cand
-    = ipa_get_adjustment_candidate (&tp, NULL, info->adjustments, true);
-  if (!cand)
-    return NULL_TREE;
-
-  tree t = *tp;
-  tree repl = make_ssa_name (TREE_TYPE (t), NULL);
-
-  gimple stmt;
-  gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
-  if (wi->is_lhs)
-    {
-      stmt = gimple_build_assign (unshare_expr (cand->new_decl), repl);
-      gsi_insert_after (&gsi, stmt, GSI_SAME_STMT);
-      SSA_NAME_DEF_STMT (repl) = info->stmt;
     }
+
+  tree repl = NULL_TREE;
+  if (cand)
+    repl = unshare_expr (cand->new_decl);
   else
     {
-      /* You'd think we could skip the extra SSA variable when
-	 wi->val_only=true, but we may have `*var' which will get
-	 replaced into `*var_array[iter]' and will likely be something
-	 not gimple.  */
-      stmt = gimple_build_assign (repl, unshare_expr (cand->new_decl));
-      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+      if (tp != orig_tp)
+	{
+	  *walk_subtrees = 0;
+	  bool modified = info->modified;
+	  info->modified = false;
+	  walk_tree (tp, ipa_simd_modify_stmt_ops, wi, wi->pset);
+	  if (!info->modified)
+	    {
+	      info->modified = modified;
+	      return NULL_TREE;
+	    }
+	  info->modified = modified;
+	  repl = *tp;
+	}
+      else
+	return NULL_TREE;
     }
 
-  if (!useless_type_conversion_p (TREE_TYPE (*tp), TREE_TYPE (repl)))
+  if (tp != orig_tp)
+    {
+      repl = build_fold_addr_expr (repl);
+      gimple stmt
+	= gimple_build_assign (make_ssa_name (TREE_TYPE (repl), NULL), repl);
+      repl = gimple_assign_lhs (stmt);
+      gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
+      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+      *orig_tp = repl;
+    }
+  else if (!useless_type_conversion_p (TREE_TYPE (*tp), TREE_TYPE (repl)))
     {
       tree vce = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (*tp), repl);
       *tp = vce;
@@ -11328,8 +11336,6 @@ ipa_simd_modify_stmt_ops (tree *tp, int
     *tp = repl;
 
   info->modified = true;
-  wi->is_lhs = false;
-  wi->val_only = true;
   return NULL_TREE;
 }
 
@@ -11348,7 +11354,7 @@ ipa_simd_modify_function_body (struct cg
 			       tree retval_array, tree iter)
 {
   basic_block bb;
-  unsigned int i, j;
+  unsigned int i, j, l;
 
   /* Re-use the adjustments array, but this time use it to replace
      every function argument use to an offset into the corresponding
@@ -11371,6 +11377,46 @@ ipa_simd_modify_function_body (struct cg
 	j += node->simdclone->simdlen / TYPE_VECTOR_SUBPARTS (vectype) - 1;
     }
 
+  l = adjustments.length ();
+  for (i = 1; i < num_ssa_names; i++)
+    {
+      tree name = ssa_name (i);
+      if (name
+	  && SSA_NAME_VAR (name)
+	  && TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL)
+	{
+	  for (j = 0; j < l; j++)
+	    if (SSA_NAME_VAR (name) == adjustments[j].base
+		&& adjustments[j].new_decl)
+	      {
+		tree base_var;
+		if (adjustments[j].new_ssa_base == NULL_TREE)
+		  {
+		    base_var
+		      = copy_var_decl (adjustments[j].base,
+				       DECL_NAME (adjustments[j].base),
+				       TREE_TYPE (adjustments[j].base));
+		    adjustments[j].new_ssa_base = base_var;
+		  }
+		else
+		  base_var = adjustments[j].new_ssa_base;
+		if (SSA_NAME_IS_DEFAULT_DEF (name))
+		  {
+		    bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+		    gimple_stmt_iterator gsi = gsi_after_labels (bb);
+		    tree new_decl = unshare_expr (adjustments[j].new_decl);
+		    set_ssa_default_def (cfun, adjustments[j].base, NULL_TREE);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (name, base_var);
+		    SSA_NAME_IS_DEFAULT_DEF (name) = 0;
+		    gimple stmt = gimple_build_assign (name, new_decl);
+		    gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+		  }
+		else
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (name, base_var);
+	      }
+	}
+    }
+
   struct modify_stmt_info info;
   info.adjustments = adjustments;
 
--- gcc/tree-dfa.c.jj	2014-03-13 20:09:18.000000000 +0100
+++ gcc/tree-dfa.c	2014-04-18 11:52:41.043248454 +0200
@@ -343,7 +343,7 @@ set_ssa_default_def (struct function *fn
     {
       loc = htab_find_slot_with_hash (DEFAULT_DEFS (fn), &in,
 				      DECL_UID (var), NO_INSERT);
-      if (*loc)
+      if (loc)
 	{
 	  SSA_NAME_IS_DEFAULT_DEF (*(tree *)loc) = false;
 	  htab_clear_slot (DEFAULT_DEFS (fn), loc);
--- gcc/testsuite/c-c++-common/gomp/pr60823-1.c.jj	2014-04-17 15:35:52.253884292 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr60823-1.c	2014-04-17 15:35:52.253884292 +0200
@@ -0,0 +1,19 @@
+/* PR tree-optimization/60823 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp-simd" } */
+
+#pragma omp declare simd simdlen(4) notinbranch
+int
+foo (const double c1, const double c2)
+{
+  double z1 = c1, z2 = c2;
+  int res = 100, i;
+
+  for (i = 0; i < 100; i++)
+    {
+      res = (z1 * z1 + z2 * z2 > 4.0) ? (i < res ? i : res) : res;
+      z1 = c1 + z1 * z1 - z2 * z2;
+      z2 = c2 + 2.0 * z1 * z2;
+    }
+  return res;
+}
--- gcc/testsuite/c-c++-common/gomp/pr60823-2.c.jj	2014-04-17 15:35:52.254884210 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr60823-2.c	2014-04-17 15:35:52.253884292 +0200
@@ -0,0 +1,43 @@
+/* PR tree-optimization/60823 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fopenmp-simd" } */
+
+#pragma omp declare simd simdlen(4) notinbranch
+__attribute__((noinline)) int
+foo (double c1, double c2)
+{
+  double z1 = c1, z2 = c2;
+  int res = 100, i;
+
+  for (i = 0; i < 5; i++)
+    {
+      res = (z1 * z1 + z2 * z2 > 4.0) ? (i < res ? i : res) : res;
+      z1 = c1 + z1 * z1 - z2 * z2;
+      z2 = c2 + 2.0 * z1 * z2;
+      c1 += 0.5;
+      c2 += 0.5;
+    }
+  return res;
+}
+
+__attribute__((noinline, noclone)) void
+bar (double *x, double *y)
+{
+  asm volatile ("" : : "rm" (x), "rm" (y) : "memory");
+}
+
+int
+main ()
+{
+  int i;
+  double c[4] = { 0.0, 1.0, 0.0, 1.0 };
+  double d[4] = { 0.0, 1.0, 2.0, 0.0 };
+  int e[4];
+  bar (c, d);
+#pragma omp simd safelen(4)
+  for (i = 0; i < 4; i++)
+    e[i] = foo (c[i], d[i]);
+  if (e[0] != 3 || e[1] != 1 || e[2] != 1 || e[3] != 2)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/c-c++-common/gomp/pr60823-3.c.jj	2014-04-17 16:43:42.580699699 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr60823-3.c	2014-04-17 16:42:59.000000000 +0200
@@ -0,0 +1,32 @@
+/* PR tree-optimization/60823 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp-simd -fno-strict-aliasing" } */
+
+void bar (char *, double *);
+
+#if __SIZEOF_DOUBLE__ >= 4
+
+struct S { char c[sizeof (double)]; };
+void baz (struct S, struct S);
+union U { struct S s; double d; };
+
+#pragma omp declare simd simdlen(4) notinbranch
+__attribute__((noinline)) int
+foo (double c1, double c2)
+{
+  double *a = &c1;
+  char *b = (char *) &c1 + 2;
+
+  b[-2]++;
+  b[1]--;
+  *a++;
+  c2++;
+  bar ((char *) &c2 + 1, &c2);
+  c2 *= 3.0;
+  bar (b, a);
+  baz (((union U) { .d = c1 }).s, ((union U) { .d = c2 }).s);
+  baz (*(struct S *)&c1, *(struct S *)&c2);
+  return c1 + c2 + ((struct S *)&c1)->c[1];
+}
+
+#endif

	Jakub

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
  2014-04-18 15:00 [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823) Jakub Jelinek
@ 2014-04-22  8:15 ` Richard Biener
  2014-05-13  8:48 ` Rainer Orth
  1 sibling, 0 replies; 8+ messages in thread
From: Richard Biener @ 2014-04-22  8:15 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Fri, 18 Apr 2014, Jakub Jelinek wrote:

> Hi!
> 
> This patch fixes the adjustments performed by ipa_simd_modify_function_body
> on omp declare simd clones.
> Previously we've been trying to replace typically SSA_NAMEs with underlying
> PARM_DECLs of the to be replaced arguments with loads/stores from/to
> array refs (that will be hopefully vectorized) right around the referencing
> stmt, but:
> 1) this can't really work well if there is any life range overlap in SSA_NAMEs
>    with the same underlying PARM_DECL
> 2) PHIs weren't handled at all (neither PHI arguments, nor lhs of the PHIs)
> 3) for addressable PARM_DECLs the code pretty much assumed the same thing
>    can be done too
> 
> This patch instead adjusts all SSA_NAMEs with SSA_NAME_VAR equal to the to
> be replaced PARM_DECLs to a new underlying VAR_DECL, only changes the
> (D) SSA_NAME to a load done at the start of the entry block, and for
> addressable PARM_DECLs adjusts them such that they don't have to be
> regimplified (as we replace say address of a PARM_DECL which is a
> gimple_min_invariant with array ref with variable index which is not
> gimple_min_invariant, we need to force the addresses into SSA_NAMEs).
> 
> The tree-dfa.c fix is what I've discovered while writing the patch,
> if htab_find_slot_with_hash (..., NO_INSERT) fails to find something
> in the hash table (most likely not actually needed by the patch, discovered
> that just because the patch was buggy initially), it returns NULL rather
> than address of some slot which will contain NULL.

Probably doesn't matter in practice as we are clearing a default-def
only if it is one (and thus it is recorded).

> Bootstrapped/regtested on x86_64-linux and i686-linux.
> 
> Richard, does this look reasonable?

Yeah.

Thanks,
Richard.

> 2014-04-18  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/60823
> 	* omp-low.c (ipa_simd_modify_function_body): Go through
> 	all SSA_NAMEs and for those refering to vector arguments
> 	which are going to be replaced adjust SSA_NAME_VAR and,
> 	if it is a default definition, change it into a non-default
> 	definition assigned at the beginning of function from new_decl.
> 	(ipa_simd_modify_stmt_ops): Rewritten.
> 	* tree-dfa.c (set_ssa_default_def): When removing default def,
> 	check for NULL loc instead of NULL *loc.
> 
> 	* c-c++-common/gomp/pr60823-1.c: New test.
> 	* c-c++-common/gomp/pr60823-2.c: New test.
> 	* c-c++-common/gomp/pr60823-3.c: New test.
> 
> --- gcc/omp-low.c.jj	2014-04-17 14:48:59.076025713 +0200
> +++ gcc/omp-low.c	2014-04-18 12:00:16.666701773 +0200
> @@ -11281,45 +11281,53 @@ static tree
>  ipa_simd_modify_stmt_ops (tree *tp, int *walk_subtrees, void *data)
>  {
>    struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
> -  if (!SSA_VAR_P (*tp))
> +  struct modify_stmt_info *info = (struct modify_stmt_info *) wi->info;
> +  tree *orig_tp = tp;
> +  if (TREE_CODE (*tp) == ADDR_EXPR)
> +    tp = &TREE_OPERAND (*tp, 0);
> +  struct ipa_parm_adjustment *cand = NULL;
> +  if (TREE_CODE (*tp) == PARM_DECL)
> +    cand = ipa_get_adjustment_candidate (&tp, NULL, info->adjustments, true);
> +  else
>      {
> -      /* Make sure we treat subtrees as a RHS.  This makes sure that
> -	 when examining the `*foo' in *foo=x, the `foo' get treated as
> -	 a use properly.  */
> -      wi->is_lhs = false;
> -      wi->val_only = true;
>        if (TYPE_P (*tp))
>  	*walk_subtrees = 0;
> -      return NULL_TREE;
> -    }
> -  struct modify_stmt_info *info = (struct modify_stmt_info *) wi->info;
> -  struct ipa_parm_adjustment *cand
> -    = ipa_get_adjustment_candidate (&tp, NULL, info->adjustments, true);
> -  if (!cand)
> -    return NULL_TREE;
> -
> -  tree t = *tp;
> -  tree repl = make_ssa_name (TREE_TYPE (t), NULL);
> -
> -  gimple stmt;
> -  gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
> -  if (wi->is_lhs)
> -    {
> -      stmt = gimple_build_assign (unshare_expr (cand->new_decl), repl);
> -      gsi_insert_after (&gsi, stmt, GSI_SAME_STMT);
> -      SSA_NAME_DEF_STMT (repl) = info->stmt;
>      }
> +
> +  tree repl = NULL_TREE;
> +  if (cand)
> +    repl = unshare_expr (cand->new_decl);
>    else
>      {
> -      /* You'd think we could skip the extra SSA variable when
> -	 wi->val_only=true, but we may have `*var' which will get
> -	 replaced into `*var_array[iter]' and will likely be something
> -	 not gimple.  */
> -      stmt = gimple_build_assign (repl, unshare_expr (cand->new_decl));
> -      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
> +      if (tp != orig_tp)
> +	{
> +	  *walk_subtrees = 0;
> +	  bool modified = info->modified;
> +	  info->modified = false;
> +	  walk_tree (tp, ipa_simd_modify_stmt_ops, wi, wi->pset);
> +	  if (!info->modified)
> +	    {
> +	      info->modified = modified;
> +	      return NULL_TREE;
> +	    }
> +	  info->modified = modified;
> +	  repl = *tp;
> +	}
> +      else
> +	return NULL_TREE;
>      }
>  
> -  if (!useless_type_conversion_p (TREE_TYPE (*tp), TREE_TYPE (repl)))
> +  if (tp != orig_tp)
> +    {
> +      repl = build_fold_addr_expr (repl);
> +      gimple stmt
> +	= gimple_build_assign (make_ssa_name (TREE_TYPE (repl), NULL), repl);
> +      repl = gimple_assign_lhs (stmt);
> +      gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
> +      gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
> +      *orig_tp = repl;
> +    }
> +  else if (!useless_type_conversion_p (TREE_TYPE (*tp), TREE_TYPE (repl)))
>      {
>        tree vce = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (*tp), repl);
>        *tp = vce;
> @@ -11328,8 +11336,6 @@ ipa_simd_modify_stmt_ops (tree *tp, int
>      *tp = repl;
>  
>    info->modified = true;
> -  wi->is_lhs = false;
> -  wi->val_only = true;
>    return NULL_TREE;
>  }
>  
> @@ -11348,7 +11354,7 @@ ipa_simd_modify_function_body (struct cg
>  			       tree retval_array, tree iter)
>  {
>    basic_block bb;
> -  unsigned int i, j;
> +  unsigned int i, j, l;
>  
>    /* Re-use the adjustments array, but this time use it to replace
>       every function argument use to an offset into the corresponding
> @@ -11371,6 +11377,46 @@ ipa_simd_modify_function_body (struct cg
>  	j += node->simdclone->simdlen / TYPE_VECTOR_SUBPARTS (vectype) - 1;
>      }
>  
> +  l = adjustments.length ();
> +  for (i = 1; i < num_ssa_names; i++)
> +    {
> +      tree name = ssa_name (i);
> +      if (name
> +	  && SSA_NAME_VAR (name)
> +	  && TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL)
> +	{
> +	  for (j = 0; j < l; j++)
> +	    if (SSA_NAME_VAR (name) == adjustments[j].base
> +		&& adjustments[j].new_decl)
> +	      {
> +		tree base_var;
> +		if (adjustments[j].new_ssa_base == NULL_TREE)
> +		  {
> +		    base_var
> +		      = copy_var_decl (adjustments[j].base,
> +				       DECL_NAME (adjustments[j].base),
> +				       TREE_TYPE (adjustments[j].base));
> +		    adjustments[j].new_ssa_base = base_var;
> +		  }
> +		else
> +		  base_var = adjustments[j].new_ssa_base;
> +		if (SSA_NAME_IS_DEFAULT_DEF (name))
> +		  {
> +		    bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> +		    gimple_stmt_iterator gsi = gsi_after_labels (bb);
> +		    tree new_decl = unshare_expr (adjustments[j].new_decl);
> +		    set_ssa_default_def (cfun, adjustments[j].base, NULL_TREE);
> +		    SET_SSA_NAME_VAR_OR_IDENTIFIER (name, base_var);
> +		    SSA_NAME_IS_DEFAULT_DEF (name) = 0;
> +		    gimple stmt = gimple_build_assign (name, new_decl);
> +		    gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
> +		  }
> +		else
> +		  SET_SSA_NAME_VAR_OR_IDENTIFIER (name, base_var);
> +	      }
> +	}
> +    }
> +
>    struct modify_stmt_info info;
>    info.adjustments = adjustments;
>  
> --- gcc/tree-dfa.c.jj	2014-03-13 20:09:18.000000000 +0100
> +++ gcc/tree-dfa.c	2014-04-18 11:52:41.043248454 +0200
> @@ -343,7 +343,7 @@ set_ssa_default_def (struct function *fn
>      {
>        loc = htab_find_slot_with_hash (DEFAULT_DEFS (fn), &in,
>  				      DECL_UID (var), NO_INSERT);
> -      if (*loc)
> +      if (loc)
>  	{
>  	  SSA_NAME_IS_DEFAULT_DEF (*(tree *)loc) = false;
>  	  htab_clear_slot (DEFAULT_DEFS (fn), loc);
> --- gcc/testsuite/c-c++-common/gomp/pr60823-1.c.jj	2014-04-17 15:35:52.253884292 +0200
> +++ gcc/testsuite/c-c++-common/gomp/pr60823-1.c	2014-04-17 15:35:52.253884292 +0200
> @@ -0,0 +1,19 @@
> +/* PR tree-optimization/60823 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fopenmp-simd" } */
> +
> +#pragma omp declare simd simdlen(4) notinbranch
> +int
> +foo (const double c1, const double c2)
> +{
> +  double z1 = c1, z2 = c2;
> +  int res = 100, i;
> +
> +  for (i = 0; i < 100; i++)
> +    {
> +      res = (z1 * z1 + z2 * z2 > 4.0) ? (i < res ? i : res) : res;
> +      z1 = c1 + z1 * z1 - z2 * z2;
> +      z2 = c2 + 2.0 * z1 * z2;
> +    }
> +  return res;
> +}
> --- gcc/testsuite/c-c++-common/gomp/pr60823-2.c.jj	2014-04-17 15:35:52.254884210 +0200
> +++ gcc/testsuite/c-c++-common/gomp/pr60823-2.c	2014-04-17 15:35:52.253884292 +0200
> @@ -0,0 +1,43 @@
> +/* PR tree-optimization/60823 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fopenmp-simd" } */
> +
> +#pragma omp declare simd simdlen(4) notinbranch
> +__attribute__((noinline)) int
> +foo (double c1, double c2)
> +{
> +  double z1 = c1, z2 = c2;
> +  int res = 100, i;
> +
> +  for (i = 0; i < 5; i++)
> +    {
> +      res = (z1 * z1 + z2 * z2 > 4.0) ? (i < res ? i : res) : res;
> +      z1 = c1 + z1 * z1 - z2 * z2;
> +      z2 = c2 + 2.0 * z1 * z2;
> +      c1 += 0.5;
> +      c2 += 0.5;
> +    }
> +  return res;
> +}
> +
> +__attribute__((noinline, noclone)) void
> +bar (double *x, double *y)
> +{
> +  asm volatile ("" : : "rm" (x), "rm" (y) : "memory");
> +}
> +
> +int
> +main ()
> +{
> +  int i;
> +  double c[4] = { 0.0, 1.0, 0.0, 1.0 };
> +  double d[4] = { 0.0, 1.0, 2.0, 0.0 };
> +  int e[4];
> +  bar (c, d);
> +#pragma omp simd safelen(4)
> +  for (i = 0; i < 4; i++)
> +    e[i] = foo (c[i], d[i]);
> +  if (e[0] != 3 || e[1] != 1 || e[2] != 1 || e[3] != 2)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/c-c++-common/gomp/pr60823-3.c.jj	2014-04-17 16:43:42.580699699 +0200
> +++ gcc/testsuite/c-c++-common/gomp/pr60823-3.c	2014-04-17 16:42:59.000000000 +0200
> @@ -0,0 +1,32 @@
> +/* PR tree-optimization/60823 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fopenmp-simd -fno-strict-aliasing" } */
> +
> +void bar (char *, double *);
> +
> +#if __SIZEOF_DOUBLE__ >= 4
> +
> +struct S { char c[sizeof (double)]; };
> +void baz (struct S, struct S);
> +union U { struct S s; double d; };
> +
> +#pragma omp declare simd simdlen(4) notinbranch
> +__attribute__((noinline)) int
> +foo (double c1, double c2)
> +{
> +  double *a = &c1;
> +  char *b = (char *) &c1 + 2;
> +
> +  b[-2]++;
> +  b[1]--;
> +  *a++;
> +  c2++;
> +  bar ((char *) &c2 + 1, &c2);
> +  c2 *= 3.0;
> +  bar (b, a);
> +  baz (((union U) { .d = c1 }).s, ((union U) { .d = c2 }).s);
> +  baz (*(struct S *)&c1, *(struct S *)&c2);
> +  return c1 + c2 + ((struct S *)&c1)->c[1];
> +}
> +
> +#endif
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
  2014-04-18 15:00 [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823) Jakub Jelinek
  2014-04-22  8:15 ` Richard Biener
@ 2014-05-13  8:48 ` Rainer Orth
  2014-05-13 17:39   ` Rainer Orth
  1 sibling, 1 reply; 8+ messages in thread
From: Rainer Orth @ 2014-05-13  8:48 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:

> 2014-04-18  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR tree-optimization/60823
> 	* omp-low.c (ipa_simd_modify_function_body): Go through
> 	all SSA_NAMEs and for those refering to vector arguments
> 	which are going to be replaced adjust SSA_NAME_VAR and,
> 	if it is a default definition, change it into a non-default
> 	definition assigned at the beginning of function from new_decl.
> 	(ipa_simd_modify_stmt_ops): Rewritten.
> 	* tree-dfa.c (set_ssa_default_def): When removing default def,
> 	check for NULL loc instead of NULL *loc.
>
> 	* c-c++-common/gomp/pr60823-1.c: New test.
> 	* c-c++-common/gomp/pr60823-2.c: New test.
> 	* c-c++-common/gomp/pr60823-3.c: New test.

The second test FAILs on Solaris/x86 with Sun as:

ld.so.1: pr60823-2.exe: fatal: pr60823-2.exe: hardware capability (CA_SUNW_HW_1) unsupported: 0x20000000  [ AVX ]
FAIL: c-c++-common/gomp/pr60823-2.c execution test

If this is expected, I can extend the code in gcc.target/i386/i386.exp
(or rather move it to a new lib/clearcap.exp) to handle that via linker
maps.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
  2014-05-13  8:48 ` Rainer Orth
@ 2014-05-13 17:39   ` Rainer Orth
  2014-05-13 17:51     ` Jakub Jelinek
  0 siblings, 1 reply; 8+ messages in thread
From: Rainer Orth @ 2014-05-13 17:39 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:

> Jakub Jelinek <jakub@redhat.com> writes:
>
>> 2014-04-18  Jakub Jelinek  <jakub@redhat.com>
>>
>> 	PR tree-optimization/60823
>> 	* omp-low.c (ipa_simd_modify_function_body): Go through
>> 	all SSA_NAMEs and for those refering to vector arguments
>> 	which are going to be replaced adjust SSA_NAME_VAR and,
>> 	if it is a default definition, change it into a non-default
>> 	definition assigned at the beginning of function from new_decl.
>> 	(ipa_simd_modify_stmt_ops): Rewritten.
>> 	* tree-dfa.c (set_ssa_default_def): When removing default def,
>> 	check for NULL loc instead of NULL *loc.
>>
>> 	* c-c++-common/gomp/pr60823-1.c: New test.
>> 	* c-c++-common/gomp/pr60823-2.c: New test.
>> 	* c-c++-common/gomp/pr60823-3.c: New test.
>
> The second test FAILs on Solaris/x86 with Sun as:
>
> ld.so.1: pr60823-2.exe: fatal: pr60823-2.exe: hardware capability (CA_SUNW_HW_1) unsupported: 0x20000000  [ AVX ]
> FAIL: c-c++-common/gomp/pr60823-2.c execution test
>
> If this is expected, I can extend the code in gcc.target/i386/i386.exp
> (or rather move it to a new lib/clearcap.exp) to handle that via linker
> maps.

Something else seems to be amiss here: the new
libgomp.fortran/declare-simd-[12].f90 tests fail in just the same way,
although avx_runtime is false and they are only compiled with -msse2.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
  2014-05-13 17:39   ` Rainer Orth
@ 2014-05-13 17:51     ` Jakub Jelinek
  2014-05-15 12:37       ` Rainer Orth
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2014-05-13 17:51 UTC (permalink / raw)
  To: Rainer Orth; +Cc: Richard Biener, gcc-patches

On Tue, May 13, 2014 at 07:38:52PM +0200, Rainer Orth wrote:
> Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
> 
> > Jakub Jelinek <jakub@redhat.com> writes:
> >
> >> 2014-04-18  Jakub Jelinek  <jakub@redhat.com>
> >>
> >> 	PR tree-optimization/60823
> >> 	* omp-low.c (ipa_simd_modify_function_body): Go through
> >> 	all SSA_NAMEs and for those refering to vector arguments
> >> 	which are going to be replaced adjust SSA_NAME_VAR and,
> >> 	if it is a default definition, change it into a non-default
> >> 	definition assigned at the beginning of function from new_decl.
> >> 	(ipa_simd_modify_stmt_ops): Rewritten.
> >> 	* tree-dfa.c (set_ssa_default_def): When removing default def,
> >> 	check for NULL loc instead of NULL *loc.
> >>
> >> 	* c-c++-common/gomp/pr60823-1.c: New test.
> >> 	* c-c++-common/gomp/pr60823-2.c: New test.
> >> 	* c-c++-common/gomp/pr60823-3.c: New test.
> >
> > The second test FAILs on Solaris/x86 with Sun as:
> >
> > ld.so.1: pr60823-2.exe: fatal: pr60823-2.exe: hardware capability (CA_SUNW_HW_1) unsupported: 0x20000000  [ AVX ]
> > FAIL: c-c++-common/gomp/pr60823-2.c execution test
> >
> > If this is expected, I can extend the code in gcc.target/i386/i386.exp
> > (or rather move it to a new lib/clearcap.exp) to handle that via linker
> > maps.
> 
> Something else seems to be amiss here: the new
> libgomp.fortran/declare-simd-[12].f90 tests fail in just the same way,
> although avx_runtime is false and they are only compiled with -msse2.

If OpenMP declare simd doesn't work on Solaris/x86 (due to the bogus hw cap
stuff), then supposedly vect_simd_clones effective target should fail there.
OpenMP declare simd results in cloning of the functions for SSE2, AVX and
AVX2.

	Jakub

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
  2014-05-13 17:51     ` Jakub Jelinek
@ 2014-05-15 12:37       ` Rainer Orth
  2014-05-15 12:42         ` Jakub Jelinek
  0 siblings, 1 reply; 8+ messages in thread
From: Rainer Orth @ 2014-05-15 12:37 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:

> On Tue, May 13, 2014 at 07:38:52PM +0200, Rainer Orth wrote:
>> Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
>> 
>> > Jakub Jelinek <jakub@redhat.com> writes:
>> >
>> >> 2014-04-18  Jakub Jelinek  <jakub@redhat.com>
>> >>
>> >> 	PR tree-optimization/60823
>> >> 	* omp-low.c (ipa_simd_modify_function_body): Go through
>> >> 	all SSA_NAMEs and for those refering to vector arguments
>> >> 	which are going to be replaced adjust SSA_NAME_VAR and,
>> >> 	if it is a default definition, change it into a non-default
>> >> 	definition assigned at the beginning of function from new_decl.
>> >> 	(ipa_simd_modify_stmt_ops): Rewritten.
>> >> 	* tree-dfa.c (set_ssa_default_def): When removing default def,
>> >> 	check for NULL loc instead of NULL *loc.
>> >>
>> >> 	* c-c++-common/gomp/pr60823-1.c: New test.
>> >> 	* c-c++-common/gomp/pr60823-2.c: New test.
>> >> 	* c-c++-common/gomp/pr60823-3.c: New test.
>> >
>> > The second test FAILs on Solaris/x86 with Sun as:
>> >
>> > ld.so.1: pr60823-2.exe: fatal: pr60823-2.exe: hardware capability
>> > (CA_SUNW_HW_1) unsupported: 0x20000000 [ AVX ]
>> > FAIL: c-c++-common/gomp/pr60823-2.c execution test
>> >
>> > If this is expected, I can extend the code in gcc.target/i386/i386.exp
>> > (or rather move it to a new lib/clearcap.exp) to handle that via linker
>> > maps.
>> 
>> Something else seems to be amiss here: the new
>> libgomp.fortran/declare-simd-[12].f90 tests fail in just the same way,
>> although avx_runtime is false and they are only compiled with -msse2.
>
> If OpenMP declare simd doesn't work on Solaris/x86 (due to the bogus hw cap
> stuff), then supposedly vect_simd_clones effective target should fail there.

I don't think it's bogus: it guards against a similar kind of problems
as symbol versioning.  There, programs that depend on a missing
interface don't start to run instead of crashing in the middle of
execution when a function is missing.  With hwcap, programs depending on
insns not supported by the host don't even start running instead of
crashing later on.

Only if you start playing games with runtime selection of code based on
the host you're running on this gets you into trouble.  If this done
explicitly and you know what you are doing, you can still assemble with
-nH to suppress hwcaps, but if it happens implicitly, this is not an
option, at least not a user-friendly one.

> OpenMP declare simd results in cloning of the functions for SSE2, AVX and
> AVX2.

Ok, that's what I was missing.  ISTM that on Solaris with as/ld,
-fopenmp should trigger linking with a mapfile like the one currently
used in some places in the testsuite.  I envision turning this into a
general facility (-mclear-hwcap; -m is for target-specific options,
right?) which is automatically turned on by -fopenmp on Solaris.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
  2014-05-15 12:37       ` Rainer Orth
@ 2014-05-15 12:42         ` Jakub Jelinek
  2014-05-15 12:47           ` Rainer Orth
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2014-05-15 12:42 UTC (permalink / raw)
  To: Rainer Orth; +Cc: Richard Biener, gcc-patches

On Thu, May 15, 2014 at 02:37:33PM +0200, Rainer Orth wrote:
> > If OpenMP declare simd doesn't work on Solaris/x86 (due to the bogus hw cap
> > stuff), then supposedly vect_simd_clones effective target should fail there.
> 
> I don't think it's bogus: it guards against a similar kind of problems
> as symbol versioning.  There, programs that depend on a missing
> interface don't start to run instead of crashing in the middle of
> execution when a function is missing.  With hwcap, programs depending on
> insns not supported by the host don't even start running instead of
> crashing later on.

Runtime selection of code is very common though, which is why I think that
whole idea of hw cap flags checking is bogus.

> > OpenMP declare simd results in cloning of the functions for SSE2, AVX and
> > AVX2.
> 
> Ok, that's what I was missing.  ISTM that on Solaris with as/ld,
> -fopenmp should trigger linking with a mapfile like the one currently
> used in some places in the testsuite.  I envision turning this into a
> general facility (-mclear-hwcap; -m is for target-specific options,
> right?) which is automatically turned on by -fopenmp on Solaris.

Perhaps.

	Jakub

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
  2014-05-15 12:42         ` Jakub Jelinek
@ 2014-05-15 12:47           ` Rainer Orth
  0 siblings, 0 replies; 8+ messages in thread
From: Rainer Orth @ 2014-05-15 12:47 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:

> On Thu, May 15, 2014 at 02:37:33PM +0200, Rainer Orth wrote:
>> > If OpenMP declare simd doesn't work on Solaris/x86 (due to the bogus hw cap
>> > stuff), then supposedly vect_simd_clones effective target should fail there.
>> 
>> I don't think it's bogus: it guards against a similar kind of problems
>> as symbol versioning.  There, programs that depend on a missing
>> interface don't start to run instead of crashing in the middle of
>> execution when a function is missing.  With hwcap, programs depending on
>> insns not supported by the host don't even start running instead of
>> crashing later on.
>
> Runtime selection of code is very common though, which is why I think that
> whole idea of hw cap flags checking is bogus.

That may be changing, but what I observe far more often is code compiled
on one system with -mcpu/arch=native, later you try to execute it on a
different system and it crashes, giving no real indication why.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-05-15 12:47 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-18 15:00 [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823) Jakub Jelinek
2014-04-22  8:15 ` Richard Biener
2014-05-13  8:48 ` Rainer Orth
2014-05-13 17:39   ` Rainer Orth
2014-05-13 17:51     ` Jakub Jelinek
2014-05-15 12:37       ` Rainer Orth
2014-05-15 12:42         ` Jakub Jelinek
2014-05-15 12:47           ` Rainer Orth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).