public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
@ 2017-08-07  9:08 Jakub Jelinek
  2017-08-07 14:54 ` Jason Merrill
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2017-08-07  9:08 UTC (permalink / raw)
  To: Jason Merrill, Joseph S. Myers, Marek Polacek, Jonathan Wakely
  Cc: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 1059 bytes --]

Hi!

glibc for -ffast-math annotates a couple of math functions with simd
attribute, so that one can use vectorized versions with 4/8/16 vectorization
factor.

If one uses ::cos or ::cosf or std::cos(double), this works just fine, but not
when using std::cos(float).  This is because the libstdc++ headers call
__builtin_cosf, but the builtin function doesn't have the simd attribute,
only ::cosf does.

Attached are 2 patches to improve this.

The first one is a C/C++ FE change, which arranges that if we add simd
attribute to say ::cosf, then calls to __builtin_cosf will act as if
__builtin_cosf also has the attribute.  While other attributes aren't
handled this way, perhaps a small precedent to such change is that if
somebody uses typeof (cosf) cosf __asm ("foobar"); then calls to
__builtin_cosf if they expand into a library call will call foobar, not
cosf.

The other patch is instead a libstdc++ change, not using __builtin_cosf
etc., but ::cosf.

Both patches have been (separately) bootstrapped/regtested on x86_64-linux
and i686-linux.

	Jakub

[-- Attachment #2: U674 --]
[-- Type: text/plain, Size: 4848 bytes --]

2017-08-07  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/81706
	* tree.c (attribute_value_equal): Use omp_declare_simd_clauses_equal
	for comparison of OMP_CLAUSEs regardless of flag_openmp{,_simd}.

	* c-decl.c (merge_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* decl.c (duplicate_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* gcc.target/i386/pr81706.c: New test.
	* g++.dg/ext/pr81706.C: New test.

--- gcc/tree.c.jj	2017-07-29 09:48:40.000000000 +0200
+++ gcc/tree.c	2017-08-04 12:06:35.636072718 +0200
@@ -5022,8 +5022,8 @@ attribute_value_equal (const_tree attr1,
 				     TREE_VALUE (attr2)) == 1);
     }
 
-  if ((flag_openmp || flag_openmp_simd)
-      && TREE_VALUE (attr1) && TREE_VALUE (attr2)
+  if (TREE_VALUE (attr1)
+      && TREE_VALUE (attr2)
       && TREE_CODE (TREE_VALUE (attr1)) == OMP_CLAUSE
       && TREE_CODE (TREE_VALUE (attr2)) == OMP_CLAUSE)
     return omp_declare_simd_clauses_equal (TREE_VALUE (attr1),
--- gcc/c/c-decl.c.jj	2017-07-31 11:31:15.000000000 +0200
+++ gcc/c/c-decl.c	2017-08-04 12:39:48.113226134 +0200
@@ -2566,6 +2566,36 @@ merge_decls (tree newdecl, tree olddecl,
 			set_builtin_decl_declared_p (fncode, true);
 		      break;
 		    }
+
+		  tree s = lookup_attribute ("omp declare simd",
+					     DECL_ATTRIBUTES (newdecl));
+		  if (s)
+		    {
+		      tree b
+			= builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
+		      if (b)
+			{
+			  tree s2 = lookup_attribute ("omp declare simd",
+						      DECL_ATTRIBUTES (b));
+			  while (s)
+			    {
+			      tree s3;
+			      for (s3 = s2; s3;
+				   s3 = lookup_attribute ("omp declare simd",
+							  TREE_CHAIN (s3)))
+				if (attribute_value_equal (s, s3))
+				  break;
+			      if (!s3)
+				{
+				  s3 = copy_node (s);
+				  TREE_CHAIN (s3) = DECL_ATTRIBUTES (b);
+				  DECL_ATTRIBUTES (b) = s3;
+				}
+			      s = lookup_attribute ("omp declare simd",
+						    TREE_CHAIN (s));
+			    }
+			}
+		    }
 		}
 	    }
 	  else
--- gcc/cp/decl.c.jj	2017-08-01 19:23:10.000000000 +0200
+++ gcc/cp/decl.c	2017-08-04 12:44:44.773780568 +0200
@@ -2456,6 +2456,35 @@ next_arg:;
 		  break;
 		}
 	    }
+
+	  tree s = lookup_attribute ("omp declare simd",
+				     DECL_ATTRIBUTES (newdecl));
+	  if (s)
+	    {
+	      tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
+	      if (b)
+		{
+		  tree s2 = lookup_attribute ("omp declare simd",
+					      DECL_ATTRIBUTES (b));
+		  while (s)
+		    {
+		      tree s3;
+		      for (s3 = s2; s3;
+			   s3 = lookup_attribute ("omp declare simd",
+						  TREE_CHAIN (s3)))
+			if (attribute_value_equal (s, s3))
+			  break;
+		      if (!s3)
+			{
+			  s3 = copy_node (s);
+			  TREE_CHAIN (s3) = DECL_ATTRIBUTES (b);
+			  DECL_ATTRIBUTES (b) = s3;
+			}
+		      s = lookup_attribute ("omp declare simd",
+					    TREE_CHAIN (s));
+		    }
+		}
+	    }
 	}
       if (new_defines_function)
 	/* If defining a function declared with other language
--- gcc/testsuite/gcc.target/i386/pr81706.c.jj	2017-08-06 23:50:46.511337565 +0200
+++ gcc/testsuite/gcc.target/i386/pr81706.c	2017-08-06 23:50:35.000000000 +0200
@@ -0,0 +1,32 @@
+/* PR libstdc++/81706 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx2 -mno-avx512f" } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } } */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}
--- gcc/testsuite/g++.dg/ext/pr81706.C.jj	2017-08-06 23:51:09.318065575 +0200
+++ gcc/testsuite/g++.dg/ext/pr81706.C	2017-08-06 23:51:38.577716630 +0200
@@ -0,0 +1,32 @@
+// PR libstdc++/81706
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-O3 -mavx2 -mno-avx512f" }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } }
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}

[-- Attachment #3: U675 --]
[-- Type: text/plain, Size: 1772 bytes --]

2017-08-07  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/81706
	* include/c_global/cmath (std::cos, std::exp, std::log,
	std::pow, std::sin): Call ::FNf instead of __builtin_FNf in
	float overloads.

--- libstdc++-v3/include/c_global/cmath.jj	2017-07-24 10:57:58.000000000 +0200
+++ libstdc++-v3/include/c_global/cmath	2017-08-04 10:16:59.484637951 +0200
@@ -182,7 +182,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifndef __CORRECT_ISO_CPP_MATH_H_PROTO
   inline _GLIBCXX_CONSTEXPR float
   cos(float __x)
-  { return __builtin_cosf(__x); }
+  { return ::cosf(__x); }
 
   inline _GLIBCXX_CONSTEXPR long double
   cos(long double __x)
@@ -220,7 +220,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifndef __CORRECT_ISO_CPP_MATH_H_PROTO
   inline _GLIBCXX_CONSTEXPR float
   exp(float __x)
-  { return __builtin_expf(__x); }
+  { return ::expf(__x); }
 
   inline _GLIBCXX_CONSTEXPR long double
   exp(long double __x)
@@ -336,7 +336,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifndef __CORRECT_ISO_CPP_MATH_H_PROTO
   inline _GLIBCXX_CONSTEXPR float
   log(float __x)
-  { return __builtin_logf(__x); }
+  { return ::logf(__x); }
 
   inline _GLIBCXX_CONSTEXPR long double
   log(long double __x)
@@ -386,7 +386,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifndef __CORRECT_ISO_CPP_MATH_H_PROTO
   inline _GLIBCXX_CONSTEXPR float
   pow(float __x, float __y)
-  { return __builtin_powf(__x, __y); }
+  { return ::powf(__x, __y); }
 
   inline _GLIBCXX_CONSTEXPR long double
   pow(long double __x, long double __y)
@@ -423,7 +423,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifndef __CORRECT_ISO_CPP_MATH_H_PROTO
   inline _GLIBCXX_CONSTEXPR float
   sin(float __x)
-  { return __builtin_sinf(__x); }
+  { return ::sinf(__x); }
 
   inline _GLIBCXX_CONSTEXPR long double
   sin(long double __x)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-08-07  9:08 [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706) Jakub Jelinek
@ 2017-08-07 14:54 ` Jason Merrill
  2017-08-07 15:28   ` Jakub Jelinek
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Merrill @ 2017-08-07 14:54 UTC (permalink / raw)
  To: Jakub Jelinek, Joseph S. Myers, Marek Polacek, Jonathan Wakely
  Cc: gcc-patches, libstdc++

On 08/07/2017 05:08 AM, Jakub Jelinek wrote:
> +		  tree s = lookup_attribute ("omp declare simd",
> +					     DECL_ATTRIBUTES (newdecl));
> +		  if (s)
> +		    {
> +		      tree b
> +			= builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
> +		      if (b)
> +			{
> +			  tree s2 = lookup_attribute ("omp declare simd",
> +						      DECL_ATTRIBUTES (b));
> +			  while (s)
> +			    {
> +			      tree s3;
> +			      for (s3 = s2; s3;
> +				   s3 = lookup_attribute ("omp declare simd",
> +							  TREE_CHAIN (s3)))
> +				if (attribute_value_equal (s, s3))
> +				  break;
> +			      if (!s3)
> +				{
> +				  s3 = copy_node (s);
> +				  TREE_CHAIN (s3) = DECL_ATTRIBUTES (b);
> +				  DECL_ATTRIBUTES (b) = s3;
> +				}
> +			      s = lookup_attribute ("omp declare simd",
> +						    TREE_CHAIN (s));
> +			    }
> +			}
> +		    }

This should really be a separate function.  Perhaps "merge_one_attribute"?

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-08-07 14:54 ` Jason Merrill
@ 2017-08-07 15:28   ` Jakub Jelinek
  2017-08-07 20:59     ` Jonathan Wakely
  2017-09-01 11:13     ` Jakub Jelinek
  0 siblings, 2 replies; 16+ messages in thread
From: Jakub Jelinek @ 2017-08-07 15:28 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely, gcc-patches, libstdc++

On Mon, Aug 07, 2017 at 10:54:18AM -0400, Jason Merrill wrote:
> On 08/07/2017 05:08 AM, Jakub Jelinek wrote:
> > +		  tree s = lookup_attribute ("omp declare simd",
> > +					     DECL_ATTRIBUTES (newdecl));
> > +		  if (s)
> > +		    {
> > +		      tree b
> > +			= builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
> > +		      if (b)
> > +			{
> > +			  tree s2 = lookup_attribute ("omp declare simd",
> > +						      DECL_ATTRIBUTES (b));
> > +			  while (s)
> > +			    {
> > +			      tree s3;
> > +			      for (s3 = s2; s3;
> > +				   s3 = lookup_attribute ("omp declare simd",
> > +							  TREE_CHAIN (s3)))
> > +				if (attribute_value_equal (s, s3))
> > +				  break;
> > +			      if (!s3)
> > +				{
> > +				  s3 = copy_node (s);
> > +				  TREE_CHAIN (s3) = DECL_ATTRIBUTES (b);
> > +				  DECL_ATTRIBUTES (b) = s3;
> > +				}
> > +			      s = lookup_attribute ("omp declare simd",
> > +						    TREE_CHAIN (s));
> > +			    }
> > +			}
> > +		    }
> 
> This should really be a separate function.  Perhaps "merge_one_attribute"?

If it is outlined without the first 7 lines, i.e. just the body of if (b),
then it could be duplicate_one_attribute (tree *, tree, const char *);
called like if (b) duplicate_one_attribute (&DECL_ATTRIBUTES (b), s, "omp declare simd");
If it is duplicated as whole, it should be called
duplicate_one_attr_to_builtin or something similar.
In any case, it could be in tree.c or attribs.c.

The primary question is if we want this behavior, or if we should go the
libstdc++ patch routine (and for Jonathan the question is if he knows
why __builtin_XXXf has been used there rather than the ::XXXf).

	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-08-07 15:28   ` Jakub Jelinek
@ 2017-08-07 20:59     ` Jonathan Wakely
  2017-08-07 21:02       ` Jakub Jelinek
  2017-09-01 11:13     ` Jakub Jelinek
  1 sibling, 1 reply; 16+ messages in thread
From: Jonathan Wakely @ 2017-08-07 20:59 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, Joseph S. Myers, Marek Polacek, gcc-patches, libstdc++

On 07/08/17 17:27 +0200, Jakub Jelinek wrote:
>On Mon, Aug 07, 2017 at 10:54:18AM -0400, Jason Merrill wrote:
>> On 08/07/2017 05:08 AM, Jakub Jelinek wrote:
>> > +		  tree s = lookup_attribute ("omp declare simd",
>> > +					     DECL_ATTRIBUTES (newdecl));
>> > +		  if (s)
>> > +		    {
>> > +		      tree b
>> > +			= builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
>> > +		      if (b)
>> > +			{
>> > +			  tree s2 = lookup_attribute ("omp declare simd",
>> > +						      DECL_ATTRIBUTES (b));
>> > +			  while (s)
>> > +			    {
>> > +			      tree s3;
>> > +			      for (s3 = s2; s3;
>> > +				   s3 = lookup_attribute ("omp declare simd",
>> > +							  TREE_CHAIN (s3)))
>> > +				if (attribute_value_equal (s, s3))
>> > +				  break;
>> > +			      if (!s3)
>> > +				{
>> > +				  s3 = copy_node (s);
>> > +				  TREE_CHAIN (s3) = DECL_ATTRIBUTES (b);
>> > +				  DECL_ATTRIBUTES (b) = s3;
>> > +				}
>> > +			      s = lookup_attribute ("omp declare simd",
>> > +						    TREE_CHAIN (s));
>> > +			    }
>> > +			}
>> > +		    }
>>
>> This should really be a separate function.  Perhaps "merge_one_attribute"?
>
>If it is outlined without the first 7 lines, i.e. just the body of if (b),
>then it could be duplicate_one_attribute (tree *, tree, const char *);
>called like if (b) duplicate_one_attribute (&DECL_ATTRIBUTES (b), s, "omp declare simd");
>If it is duplicated as whole, it should be called
>duplicate_one_attr_to_builtin or something similar.
>In any case, it could be in tree.c or attribs.c.
>
>The primary question is if we want this behavior, or if we should go the
>libstdc++ patch routine (and for Jonathan the question is if he knows
>why __builtin_XXXf has been used there rather than the ::XXXf).

I don't know for certain, but I suspect it's because sinf, cosf, powf
etc. were new in C99, so a strict libc might not declare them in C++98
mode.

By using __builtin_sinf we don't need a declaration of sinf, we only
require the definition to exist in libc or libm. If the function is
present, but not declared for C++98, then it works.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-08-07 20:59     ` Jonathan Wakely
@ 2017-08-07 21:02       ` Jakub Jelinek
  2017-08-07 21:58         ` Jonathan Wakely
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2017-08-07 21:02 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: Jason Merrill, Joseph S. Myers, Marek Polacek, gcc-patches, libstdc++

On Mon, Aug 07, 2017 at 09:59:04PM +0100, Jonathan Wakely wrote:
> > If it is outlined without the first 7 lines, i.e. just the body of if (b),
> > then it could be duplicate_one_attribute (tree *, tree, const char *);
> > called like if (b) duplicate_one_attribute (&DECL_ATTRIBUTES (b), s, "omp declare simd");
> > If it is duplicated as whole, it should be called
> > duplicate_one_attr_to_builtin or something similar.
> > In any case, it could be in tree.c or attribs.c.
> > 
> > The primary question is if we want this behavior, or if we should go the
> > libstdc++ patch routine (and for Jonathan the question is if he knows
> > why __builtin_XXXf has been used there rather than the ::XXXf).
> 
> I don't know for certain, but I suspect it's because sinf, cosf, powf
> etc. were new in C99, so a strict libc might not declare them in C++98
> mode.
> 
> By using __builtin_sinf we don't need a declaration of sinf, we only
> require the definition to exist in libc or libm. If the function is
> present, but not declared for C++98, then it works.

Ah, so if we go the libstdc++ patch route, we'd need to check in configure
for the prototypes in C++98 mode and guard the stuff with #ifdef
_GLIBCXX_HAVE_COSF or similar, right?  Or do this only for C++11 and above.

	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-08-07 21:02       ` Jakub Jelinek
@ 2017-08-07 21:58         ` Jonathan Wakely
  0 siblings, 0 replies; 16+ messages in thread
From: Jonathan Wakely @ 2017-08-07 21:58 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, Joseph S. Myers, Marek Polacek, gcc-patches, libstdc++

On 07/08/17 23:02 +0200, Jakub Jelinek wrote:
>On Mon, Aug 07, 2017 at 09:59:04PM +0100, Jonathan Wakely wrote:
>> > If it is outlined without the first 7 lines, i.e. just the body of if (b),
>> > then it could be duplicate_one_attribute (tree *, tree, const char *);
>> > called like if (b) duplicate_one_attribute (&DECL_ATTRIBUTES (b), s, "omp declare simd");
>> > If it is duplicated as whole, it should be called
>> > duplicate_one_attr_to_builtin or something similar.
>> > In any case, it could be in tree.c or attribs.c.
>> >
>> > The primary question is if we want this behavior, or if we should go the
>> > libstdc++ patch routine (and for Jonathan the question is if he knows
>> > why __builtin_XXXf has been used there rather than the ::XXXf).
>>
>> I don't know for certain, but I suspect it's because sinf, cosf, powf
>> etc. were new in C99, so a strict libc might not declare them in C++98
>> mode.
>>
>> By using __builtin_sinf we don't need a declaration of sinf, we only
>> require the definition to exist in libc or libm. If the function is
>> present, but not declared for C++98, then it works.
>
>Ah, so if we go the libstdc++ patch route, we'd need to check in configure
>for the prototypes in C++98 mode and guard the stuff with #ifdef
>_GLIBCXX_HAVE_COSF or similar, right?  Or do this only for C++11 and above.

Yes. If we can assume that libc will declare those functions when
__cplusplus >= 201103L then we could do:

  inline _GLIBCXX_CONSTEXPR float
  cos(float __x)
  {
#if __cplusplus >= 201103L
    return ::cosf(__x);
#else
    return __builtin_cosf(__x);
#endif
  }

Alternatively we could check _GLIBCXX_USE_C99_MATH instead:

#if _GLIBCXX_USE_C99_MATH
    return ::cosf(__x);
#else
    return __builtin_cosf(__x);
#endif

That macro expands to either _GLIBCXX98_USE_C99_MATH or
_GLIBCXX11_USE_C99_MATH depending on the value of __cplusplus and
tells us if the C99 <math.h> functions are declared for the current
-std mode. The value of that macro depends on whether the following
are available in <math.h>:

        [i = fpclassify(d1);
         i = isfinite(d1);
         i = isinf(d1);
         i = isnan(d1);
         i = isnormal(d1);
         i = signbit(d1);
         i = isgreater(d1, d2);
         i = isgreaterequal(d1, d2);
         i = isless(d1, d2);
         i = islessequal(d1, d2);
         i = islessgreater(d1, d2);
         i = islessgreater(d1, d2);
         i = isunordered(d1, d2);

If it's possible that they could be declared but sinf/cosf/etc. are
not, then we'd need something like a new _GLIBCXX_HAVE_C99_MATH_FL
macro to say the C99 float and long double functions are present.

We may need that new macro anyway, to fix PR 79700.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-08-07 15:28   ` Jakub Jelinek
  2017-08-07 20:59     ` Jonathan Wakely
@ 2017-09-01 11:13     ` Jakub Jelinek
  2017-09-09 13:43       ` Jason Merrill
  1 sibling, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2017-09-01 11:13 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely, gcc-patches, libstdc++

On Mon, Aug 07, 2017 at 05:27:42PM +0200, Jakub Jelinek wrote:
> > This should really be a separate function.  Perhaps "merge_one_attribute"?
> 
> If it is outlined without the first 7 lines, i.e. just the body of if (b),
> then it could be duplicate_one_attribute (tree *, tree, const char *);
> called like if (b) duplicate_one_attribute (&DECL_ATTRIBUTES (b), s, "omp declare simd");
> If it is duplicated as whole, it should be called
> duplicate_one_attr_to_builtin or something similar.
> In any case, it could be in tree.c or attribs.c.
> 
> The primary question is if we want this behavior, or if we should go the
> libstdc++ patch routine (and for Jonathan the question is if he knows
> why __builtin_XXXf has been used there rather than the ::XXXf).

Here is updated patch that commonizes big part of that into
duplicate_one_attribute.  Bootstrapped/regtested on x86_64-linux and
i686-linux.  The question stands, do we want to go this way or using some
libstdc++ solution?

2017-09-01  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/81706
	* attribs.c (attribute_value_equal): Use omp_declare_simd_clauses_equal
	for comparison of OMP_CLAUSEs regardless of flag_openmp{,_simd}.
	(duplicate_one_attribute): New function.
	* attribs.h (duplicate_one_attribute): New declaration.

	* c-decl.c (merge_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* decl.c (duplicate_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* gcc.target/i386/pr81706.c: New test.
	* g++.dg/ext/pr81706.C: New test.

--- gcc/attribs.c.jj	2017-09-01 09:25:37.000000000 +0200
+++ gcc/attribs.c	2017-09-01 09:54:28.581146071 +0200
@@ -1116,9 +1116,9 @@ attribute_value_equal (const_tree attr1,
 				     TREE_VALUE (attr2)) == 1);
     }
 
-  if ((flag_openmp || flag_openmp_simd)
-      && TREE_VALUE (attr1) && TREE_VALUE (attr2)
+  if (TREE_VALUE (attr1)
       && TREE_CODE (TREE_VALUE (attr1)) == OMP_CLAUSE
+      && TREE_VALUE (attr2)
       && TREE_CODE (TREE_VALUE (attr2)) == OMP_CLAUSE)
     return omp_declare_simd_clauses_equal (TREE_VALUE (attr1),
 					   TREE_VALUE (attr2));
@@ -1310,6 +1310,31 @@ merge_decl_attributes (tree olddecl, tre
 			   DECL_ATTRIBUTES (newdecl));
 }
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+void
+duplicate_one_attribute (tree *attrs, tree attr, const char *name)
+{
+  if (!attr)
+    return;
+  tree a = lookup_attribute (name, *attrs);
+  while (attr)
+    {
+      tree a2;
+      for (a2 = a; a2; a2 = lookup_attribute (name, TREE_CHAIN (a2)))
+	if (attribute_value_equal (attr, a2))
+	  break;
+      if (!a2)
+	{
+	  a2 = copy_node (attr);
+	  TREE_CHAIN (a2) = *attrs;
+	  *attrs = a2;
+	}
+      attr = lookup_attribute (name, TREE_CHAIN (attr));
+    }
+}
+
 #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
 
 /* Specialization of merge_decl_attributes for various Windows targets.
--- gcc/attribs.h.jj	2017-09-01 09:25:37.000000000 +0200
+++ gcc/attribs.h	2017-09-01 09:54:57.366809765 +0200
@@ -77,6 +77,11 @@ extern tree remove_attribute (const char
 
 extern tree merge_attributes (tree, tree);
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+extern void duplicate_one_attribute (tree *, tree, const char *);
+
 /* Given two Windows decl attributes lists, possibly including
    dllimport, return a list of their union .  */
 extern tree merge_dllimport_decl_attributes (tree, tree);
--- gcc/c/c-decl.c.jj	2017-09-01 09:25:40.707410605 +0200
+++ gcc/c/c-decl.c	2017-09-01 09:49:57.419314085 +0200
@@ -2569,6 +2569,17 @@ merge_decls (tree newdecl, tree olddecl,
 			set_builtin_decl_declared_p (fncode, true);
 		      break;
 		    }
+
+		  tree s = lookup_attribute ("omp declare simd",
+					     DECL_ATTRIBUTES (newdecl));
+		  if (s)
+		    {
+		      tree b
+			= builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
+		      if (b)
+			duplicate_one_attribute (&DECL_ATTRIBUTES (b), s,
+						 "omp declare simd");
+		    }
 		}
 	    }
 	  else
--- gcc/cp/decl.c.jj	2017-09-01 09:26:24.748892739 +0200
+++ gcc/cp/decl.c	2017-09-01 09:55:52.940160495 +0200
@@ -2470,6 +2470,16 @@ next_arg:;
 		  break;
 		}
 	    }
+
+	  tree s = lookup_attribute ("omp declare simd",
+				     DECL_ATTRIBUTES (newdecl));
+	  if (s)
+	    {
+	      tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
+	      if (b)
+		duplicate_one_attribute (&DECL_ATTRIBUTES (b), s,
+					 "omp declare simd");
+	    }
 	}
       if (new_defines_function)
 	/* If defining a function declared with other language
--- gcc/testsuite/gcc.target/i386/pr81706.c.jj	2017-09-01 09:40:46.345761553 +0200
+++ gcc/testsuite/gcc.target/i386/pr81706.c	2017-09-01 09:40:46.345761553 +0200
@@ -0,0 +1,32 @@
+/* PR libstdc++/81706 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx2 -mno-avx512f" } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } } */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}
--- gcc/testsuite/g++.dg/ext/pr81706.C.jj	2017-09-01 09:40:46.345761553 +0200
+++ gcc/testsuite/g++.dg/ext/pr81706.C	2017-09-01 09:40:46.345761553 +0200
@@ -0,0 +1,32 @@
+// PR libstdc++/81706
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-O3 -mavx2 -mno-avx512f" }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } }
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}


	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-09-01 11:13     ` Jakub Jelinek
@ 2017-09-09 13:43       ` Jason Merrill
  2017-09-12  7:49         ` Jakub Jelinek
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Merrill @ 2017-09-09 13:43 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely,
	gcc-patches List, libstdc++

On Fri, Sep 1, 2017 at 1:12 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> +                 tree s = lookup_attribute ("omp declare simd",
> +                                            DECL_ATTRIBUTES (newdecl));
> +                 if (s)
> +                   {
> +                     tree b
> +                       = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
> +                     if (b)
> +                       duplicate_one_attribute (&DECL_ATTRIBUTES (b), s,
> +                                                "omp declare simd");
> +                   }

Is there a reason not to set b first and move the lookup of s into the
function as well?

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-09-09 13:43       ` Jason Merrill
@ 2017-09-12  7:49         ` Jakub Jelinek
  2017-09-29 12:32           ` Jakub Jelinek
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2017-09-12  7:49 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely,
	gcc-patches List, libstdc++

On Sat, Sep 09, 2017 at 03:42:35PM +0200, Jason Merrill wrote:
> On Fri, Sep 1, 2017 at 1:12 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > +                 tree s = lookup_attribute ("omp declare simd",
> > +                                            DECL_ATTRIBUTES (newdecl));
> > +                 if (s)
> > +                   {
> > +                     tree b
> > +                       = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
> > +                     if (b)
> > +                       duplicate_one_attribute (&DECL_ATTRIBUTES (b), s,
> > +                                                "omp declare simd");
> > +                   }
> 
> Is there a reason not to set b first and move the lookup of s into the
> function as well?

I wanted to handle the most common case (no DECL_ATTRIBUTES at all) and the
second most common case (lookup_attribute returning NULL) inline, otherwise
we'll do an extra function call for all builtins in all cases.
But if you strongly prefer that lookup to be in duplicate_one_attribute,
I can change the patch and retest.

	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-09-12  7:49         ` Jakub Jelinek
@ 2017-09-29 12:32           ` Jakub Jelinek
  2017-09-29 20:17             ` Joseph Myers
  2017-10-24 15:06             ` Jason Merrill
  0 siblings, 2 replies; 16+ messages in thread
From: Jakub Jelinek @ 2017-09-29 12:32 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely,
	gcc-patches List, libstdc++

On Tue, Sep 12, 2017 at 09:49:20AM +0200, Jakub Jelinek wrote:
> On Sat, Sep 09, 2017 at 03:42:35PM +0200, Jason Merrill wrote:
> > On Fri, Sep 1, 2017 at 1:12 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > > +                 tree s = lookup_attribute ("omp declare simd",
> > > +                                            DECL_ATTRIBUTES (newdecl));
> > > +                 if (s)
> > > +                   {
> > > +                     tree b
> > > +                       = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
> > > +                     if (b)
> > > +                       duplicate_one_attribute (&DECL_ATTRIBUTES (b), s,
> > > +                                                "omp declare simd");
> > > +                   }
> > 
> > Is there a reason not to set b first and move the lookup of s into the
> > function as well?
> 
> I wanted to handle the most common case (no DECL_ATTRIBUTES at all) and the
> second most common case (lookup_attribute returning NULL) inline, otherwise
> we'll do an extra function call for all builtins in all cases.
> But if you strongly prefer that lookup to be in duplicate_one_attribute,
> I can change the patch and retest.

Here is a patch which does that.  Also bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?

2017-09-29  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/81706
	* attribs.c (attribute_value_equal): Use omp_declare_simd_clauses_equal
	for comparison of OMP_CLAUSEs regardless of flag_openmp{,_simd}.
	(duplicate_one_attribute): New function.
	* attribs.h (duplicate_one_attribute): New declaration.

	* c-decl.c (merge_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* decl.c (duplicate_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* gcc.target/i386/pr81706.c: New test.
	* g++.dg/ext/pr81706.C: New test.

--- gcc/attribs.c.jj	2017-09-12 21:57:57.872456129 +0200
+++ gcc/attribs.c	2017-09-29 11:20:02.163793301 +0200
@@ -1125,9 +1125,9 @@ attribute_value_equal (const_tree attr1,
 				     TREE_VALUE (attr2)) == 1);
     }
 
-  if ((flag_openmp || flag_openmp_simd)
-      && TREE_VALUE (attr1) && TREE_VALUE (attr2)
+  if (TREE_VALUE (attr1)
       && TREE_CODE (TREE_VALUE (attr1)) == OMP_CLAUSE
+      && TREE_VALUE (attr2)
       && TREE_CODE (TREE_VALUE (attr2)) == OMP_CLAUSE)
     return omp_declare_simd_clauses_equal (TREE_VALUE (attr1),
 					   TREE_VALUE (attr2));
@@ -1319,6 +1319,32 @@ merge_decl_attributes (tree olddecl, tre
 			   DECL_ATTRIBUTES (newdecl));
 }
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+void
+duplicate_one_attribute (tree *attrs, tree attr, const char *name)
+{
+  attr = lookup_attribute (name, attr);
+  if (!attr)
+    return;
+  tree a = lookup_attribute (name, *attrs);
+  while (attr)
+    {
+      tree a2;
+      for (a2 = a; a2; a2 = lookup_attribute (name, TREE_CHAIN (a2)))
+	if (attribute_value_equal (attr, a2))
+	  break;
+      if (!a2)
+	{
+	  a2 = copy_node (attr);
+	  TREE_CHAIN (a2) = *attrs;
+	  *attrs = a2;
+	}
+      attr = lookup_attribute (name, TREE_CHAIN (attr));
+    }
+}
+
 #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
 
 /* Specialization of merge_decl_attributes for various Windows targets.
--- gcc/attribs.h.jj	2017-09-12 21:57:57.872456129 +0200
+++ gcc/attribs.h	2017-09-29 11:19:27.965206860 +0200
@@ -77,6 +77,11 @@ extern tree remove_attribute (const char
 
 extern tree merge_attributes (tree, tree);
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+extern void duplicate_one_attribute (tree *, tree, const char *);
+
 /* Given two Windows decl attributes lists, possibly including
    dllimport, return a list of their union .  */
 extern tree merge_dllimport_decl_attributes (tree, tree);
--- gcc/c/c-decl.c.jj	2017-09-29 09:57:12.582423577 +0200
+++ gcc/c/c-decl.c	2017-09-29 11:21:17.525881957 +0200
@@ -2569,6 +2569,13 @@ merge_decls (tree newdecl, tree olddecl,
 			set_builtin_decl_declared_p (fncode, true);
 		      break;
 		    }
+
+		  tree b
+		    = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
+		  if (b)
+		    duplicate_one_attribute (&DECL_ATTRIBUTES (b),
+					     DECL_ATTRIBUTES (newdecl),
+					     "omp declare simd");
 		}
 	    }
 	  else
--- gcc/cp/decl.c.jj	2017-09-29 09:07:33.530064213 +0200
+++ gcc/cp/decl.c	2017-09-29 11:21:55.618421309 +0200
@@ -2470,6 +2470,12 @@ next_arg:;
 		  break;
 		}
 	    }
+
+	  tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
+	  if (b)
+	    duplicate_one_attribute (&DECL_ATTRIBUTES (b),
+				     DECL_ATTRIBUTES (newdecl),
+				     "omp declare simd");
 	}
       if (new_defines_function)
 	/* If defining a function declared with other language
--- gcc/testsuite/gcc.target/i386/pr81706.c.jj	2017-09-29 11:19:27.971206787 +0200
+++ gcc/testsuite/gcc.target/i386/pr81706.c	2017-09-29 11:19:27.971206787 +0200
@@ -0,0 +1,32 @@
+/* PR libstdc++/81706 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx2 -mno-avx512f" } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } } */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}
--- gcc/testsuite/g++.dg/ext/pr81706.C.jj	2017-09-29 11:19:27.972206775 +0200
+++ gcc/testsuite/g++.dg/ext/pr81706.C	2017-09-29 11:19:27.972206775 +0200
@@ -0,0 +1,32 @@
+// PR libstdc++/81706
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-O3 -mavx2 -mno-avx512f" }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } }
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}


	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-09-29 12:32           ` Jakub Jelinek
@ 2017-09-29 20:17             ` Joseph Myers
  2017-10-24 15:06             ` Jason Merrill
  1 sibling, 0 replies; 16+ messages in thread
From: Joseph Myers @ 2017-09-29 20:17 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, Marek Polacek, Jonathan Wakely, gcc-patches List,
	libstdc++

On Fri, 29 Sep 2017, Jakub Jelinek wrote:

> Here is a patch which does that.  Also bootstrapped/regtested on
> x86_64-linux and i686-linux, ok for trunk?

> 	* c-decl.c (merge_decls): Copy "omp declare simd" attributes from
> 	newdecl to corresponding __builtin_ if any.

The C front-end changes are OK.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-09-29 12:32           ` Jakub Jelinek
  2017-09-29 20:17             ` Joseph Myers
@ 2017-10-24 15:06             ` Jason Merrill
  2017-10-24 15:34               ` Jakub Jelinek
  1 sibling, 1 reply; 16+ messages in thread
From: Jason Merrill @ 2017-10-24 15:06 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely,
	gcc-patches List, libstdc++

On 09/29/2017 08:32 AM, Jakub Jelinek wrote:
> +	  tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
> +	  if (b)
> +	    duplicate_one_attribute (&DECL_ATTRIBUTES (b),
> +				     DECL_ATTRIBUTES (newdecl),
> +				     "omp declare simd");

It occurs to me that we're likely to want to propagate other attributes 
to the builtin, too.  In the testcase, nothrow and leaf also seem 
appropriate.  Do we want a broader copy_attributes_to_builtin function, 
even if it only copies this omp attribute for now?

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-10-24 15:06             ` Jason Merrill
@ 2017-10-24 15:34               ` Jakub Jelinek
  2017-10-24 17:56                 ` Jason Merrill
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2017-10-24 15:34 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely,
	gcc-patches List, libstdc++

On Tue, Oct 24, 2017 at 11:06:51AM -0400, Jason Merrill wrote:
> On 09/29/2017 08:32 AM, Jakub Jelinek wrote:
> > +	  tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
> > +	  if (b)
> > +	    duplicate_one_attribute (&DECL_ATTRIBUTES (b),
> > +				     DECL_ATTRIBUTES (newdecl),
> > +				     "omp declare simd");
> 
> It occurs to me that we're likely to want to propagate other attributes to
> the builtin, too.  In the testcase, nothrow and leaf also seem appropriate.
> Do we want a broader copy_attributes_to_builtin function, even if it only
> copies this omp attribute for now?

So like this?

2017-10-24  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/81706
	* attribs.c (attribute_value_equal): Use omp_declare_simd_clauses_equal
	for comparison of OMP_CLAUSEs regardless of flag_openmp{,_simd}.
	(duplicate_one_attribute, copy_attributes_to_builtin): New functions.
	* attribs.h (duplicate_one_attribute, copy_attributes_to_builtin): New
	declarations.

	* c-decl.c (merge_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* decl.c (duplicate_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* gcc.target/i386/pr81706.c: New test.
	* g++.dg/ext/pr81706.C: New test.

--- gcc/attribs.c.jj	2017-10-20 16:02:57.013523597 +0200
+++ gcc/attribs.c	2017-10-24 17:28:15.831989968 +0200
@@ -1125,9 +1125,9 @@ attribute_value_equal (const_tree attr1,
 				     TREE_VALUE (attr2)) == 1);
     }
 
-  if ((flag_openmp || flag_openmp_simd)
-      && TREE_VALUE (attr1) && TREE_VALUE (attr2)
+  if (TREE_VALUE (attr1)
       && TREE_CODE (TREE_VALUE (attr1)) == OMP_CLAUSE
+      && TREE_VALUE (attr2)
       && TREE_CODE (TREE_VALUE (attr2)) == OMP_CLAUSE)
     return omp_declare_simd_clauses_equal (TREE_VALUE (attr1),
 					   TREE_VALUE (attr2));
@@ -1322,6 +1322,44 @@ merge_decl_attributes (tree olddecl, tre
 			   DECL_ATTRIBUTES (newdecl));
 }
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+void
+duplicate_one_attribute (tree *attrs, tree attr, const char *name)
+{
+  attr = lookup_attribute (name, attr);
+  if (!attr)
+    return;
+  tree a = lookup_attribute (name, *attrs);
+  while (attr)
+    {
+      tree a2;
+      for (a2 = a; a2; a2 = lookup_attribute (name, TREE_CHAIN (a2)))
+	if (attribute_value_equal (attr, a2))
+	  break;
+      if (!a2)
+	{
+	  a2 = copy_node (attr);
+	  TREE_CHAIN (a2) = *attrs;
+	  *attrs = a2;
+	}
+      attr = lookup_attribute (name, TREE_CHAIN (attr));
+    }
+}
+
+/* Duplicate all attributes with name NAME from DECL to the corresponding
+   builtin.  */
+
+void
+copy_attributes_to_builtin (tree decl, const char *name)
+{
+  tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (decl));
+  if (b)
+    duplicate_one_attribute (&DECL_ATTRIBUTES (b),
+			     DECL_ATTRIBUTES (decl), name);
+}
+
 #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
 
 /* Specialization of merge_decl_attributes for various Windows targets.
--- gcc/attribs.h.jj	2017-09-29 19:43:38.879904220 +0200
+++ gcc/attribs.h	2017-10-24 17:27:04.821841480 +0200
@@ -77,6 +77,16 @@ extern tree remove_attribute (const char
 
 extern tree merge_attributes (tree, tree);
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+extern void duplicate_one_attribute (tree *, tree, const char *);
+
+/* Duplicate all attributes with name NAME from DECL to the corresponding
+   builtin.  */
+
+extern void copy_attributes_to_builtin (tree, const char *);
+
 /* Given two Windows decl attributes lists, possibly including
    dllimport, return a list of their union .  */
 extern tree merge_dllimport_decl_attributes (tree, tree);
--- gcc/c/c-decl.c.jj	2017-10-11 22:37:48.943949030 +0200
+++ gcc/c/c-decl.c	2017-10-24 17:25:23.107061185 +0200
@@ -2569,6 +2569,8 @@ merge_decls (tree newdecl, tree olddecl,
 			set_builtin_decl_declared_p (fncode, true);
 		      break;
 		    }
+
+		  copy_attributes_to_builtin (newdecl, "omp declare simd");
 		}
 	    }
 	  else
--- gcc/cp/decl.c.jj	2017-10-13 19:02:08.492046895 +0200
+++ gcc/cp/decl.c	2017-10-24 17:29:00.486454497 +0200
@@ -2470,6 +2470,8 @@ next_arg:;
 		  break;
 		}
 	    }
+
+	  copy_attributes_to_builtin (newdecl, "omp declare simd");
 	}
       if (new_defines_function)
 	/* If defining a function declared with other language
--- gcc/testsuite/gcc.target/i386/pr81706.c.jj	2017-10-24 17:24:17.787844456 +0200
+++ gcc/testsuite/gcc.target/i386/pr81706.c	2017-10-24 17:24:17.787844456 +0200
@@ -0,0 +1,32 @@
+/* PR libstdc++/81706 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx2 -mno-avx512f" } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } } */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}
--- gcc/testsuite/g++.dg/ext/pr81706.C.jj	2017-10-24 17:24:17.788844444 +0200
+++ gcc/testsuite/g++.dg/ext/pr81706.C	2017-10-24 17:24:17.787844456 +0200
@@ -0,0 +1,32 @@
+// PR libstdc++/81706
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-O3 -mavx2 -mno-avx512f" }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } }
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}


	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-10-24 15:34               ` Jakub Jelinek
@ 2017-10-24 17:56                 ` Jason Merrill
  2017-10-24 19:36                   ` Jakub Jelinek
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Merrill @ 2017-10-24 17:56 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely,
	gcc-patches List, libstdc++

On Tue, Oct 24, 2017 at 11:33 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Oct 24, 2017 at 11:06:51AM -0400, Jason Merrill wrote:
>> On 09/29/2017 08:32 AM, Jakub Jelinek wrote:
>> > +     tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
>> > +     if (b)
>> > +       duplicate_one_attribute (&DECL_ATTRIBUTES (b),
>> > +                                DECL_ATTRIBUTES (newdecl),
>> > +                                "omp declare simd");
>>
>> It occurs to me that we're likely to want to propagate other attributes to
>> the builtin, too.  In the testcase, nothrow and leaf also seem appropriate.
>> Do we want a broader copy_attributes_to_builtin function, even if it only
>> copies this omp attribute for now?
>
> So like this?

I was thinking that the new function would decide which attributes we
want to copy over, rather than have a "name" parameter.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-10-24 17:56                 ` Jason Merrill
@ 2017-10-24 19:36                   ` Jakub Jelinek
  2017-10-24 20:26                     ` Jason Merrill
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2017-10-24 19:36 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely,
	gcc-patches List, libstdc++

On Tue, Oct 24, 2017 at 01:55:58PM -0400, Jason Merrill wrote:
> On Tue, Oct 24, 2017 at 11:33 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Tue, Oct 24, 2017 at 11:06:51AM -0400, Jason Merrill wrote:
> >> On 09/29/2017 08:32 AM, Jakub Jelinek wrote:
> >> > +     tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
> >> > +     if (b)
> >> > +       duplicate_one_attribute (&DECL_ATTRIBUTES (b),
> >> > +                                DECL_ATTRIBUTES (newdecl),
> >> > +                                "omp declare simd");
> >>
> >> It occurs to me that we're likely to want to propagate other attributes to
> >> the builtin, too.  In the testcase, nothrow and leaf also seem appropriate.
> >> Do we want a broader copy_attributes_to_builtin function, even if it only
> >> copies this omp attribute for now?
> >
> > So like this?
> 
> I was thinking that the new function would decide which attributes we
> want to copy over, rather than have a "name" parameter.

Like this then?

2017-10-24  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/81706
	* attribs.c (attribute_value_equal): Use omp_declare_simd_clauses_equal
	for comparison of OMP_CLAUSEs regardless of flag_openmp{,_simd}.
	(duplicate_one_attribute, copy_attributes_to_builtin): New functions.
	* attribs.h (duplicate_one_attribute, copy_attributes_to_builtin): New
	declarations.

	* c-decl.c (merge_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* decl.c (duplicate_decls): Copy "omp declare simd" attributes from
	newdecl to corresponding __builtin_ if any.

	* gcc.target/i386/pr81706.c: New test.
	* g++.dg/ext/pr81706.C: New test.

--- gcc/attribs.c.jj	2017-10-20 16:02:57.013523597 +0200
+++ gcc/attribs.c	2017-10-24 21:22:34.434131849 +0200
@@ -1125,9 +1125,9 @@ attribute_value_equal (const_tree attr1,
 				     TREE_VALUE (attr2)) == 1);
     }
 
-  if ((flag_openmp || flag_openmp_simd)
-      && TREE_VALUE (attr1) && TREE_VALUE (attr2)
+  if (TREE_VALUE (attr1)
       && TREE_CODE (TREE_VALUE (attr1)) == OMP_CLAUSE
+      && TREE_VALUE (attr2)
       && TREE_CODE (TREE_VALUE (attr2)) == OMP_CLAUSE)
     return omp_declare_simd_clauses_equal (TREE_VALUE (attr1),
 					   TREE_VALUE (attr2));
@@ -1322,6 +1322,44 @@ merge_decl_attributes (tree olddecl, tre
 			   DECL_ATTRIBUTES (newdecl));
 }
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+void
+duplicate_one_attribute (tree *attrs, tree attr, const char *name)
+{
+  attr = lookup_attribute (name, attr);
+  if (!attr)
+    return;
+  tree a = lookup_attribute (name, *attrs);
+  while (attr)
+    {
+      tree a2;
+      for (a2 = a; a2; a2 = lookup_attribute (name, TREE_CHAIN (a2)))
+	if (attribute_value_equal (attr, a2))
+	  break;
+      if (!a2)
+	{
+	  a2 = copy_node (attr);
+	  TREE_CHAIN (a2) = *attrs;
+	  *attrs = a2;
+	}
+      attr = lookup_attribute (name, TREE_CHAIN (attr));
+    }
+}
+
+/* Duplicate all attributes from user DECL to the corresponding
+   builtin that should be propagated.  */
+
+void
+copy_attributes_to_builtin (tree decl)
+{
+  tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (decl));
+  if (b)
+    duplicate_one_attribute (&DECL_ATTRIBUTES (b),
+			     DECL_ATTRIBUTES (decl), "omp declare simd");
+}
+
 #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
 
 /* Specialization of merge_decl_attributes for various Windows targets.
--- gcc/attribs.h.jj	2017-09-29 19:43:38.879904220 +0200
+++ gcc/attribs.h	2017-10-24 21:22:48.158966401 +0200
@@ -77,6 +77,16 @@ extern tree remove_attribute (const char
 
 extern tree merge_attributes (tree, tree);
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+extern void duplicate_one_attribute (tree *, tree, const char *);
+
+/* Duplicate all attributes from user DECL to the corresponding
+   builtin that should be propagated.  */
+
+extern void copy_attributes_to_builtin (tree);
+
 /* Given two Windows decl attributes lists, possibly including
    dllimport, return a list of their union .  */
 extern tree merge_dllimport_decl_attributes (tree, tree);
--- gcc/c/c-decl.c.jj	2017-10-11 22:37:48.943949030 +0200
+++ gcc/c/c-decl.c	2017-10-24 21:21:34.771851055 +0200
@@ -2569,6 +2569,8 @@ merge_decls (tree newdecl, tree olddecl,
 			set_builtin_decl_declared_p (fncode, true);
 		      break;
 		    }
+
+		  copy_attributes_to_builtin (newdecl);
 		}
 	    }
 	  else
--- gcc/cp/decl.c.jj	2017-10-13 19:02:08.492046895 +0200
+++ gcc/cp/decl.c	2017-10-24 21:21:07.434180598 +0200
@@ -2470,6 +2470,8 @@ next_arg:;
 		  break;
 		}
 	    }
+
+	  copy_attributes_to_builtin (newdecl);
 	}
       if (new_defines_function)
 	/* If defining a function declared with other language
--- gcc/testsuite/gcc.target/i386/pr81706.c.jj	2017-10-24 17:24:17.787844456 +0200
+++ gcc/testsuite/gcc.target/i386/pr81706.c	2017-10-24 17:24:17.787844456 +0200
@@ -0,0 +1,32 @@
+/* PR libstdc++/81706 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx2 -mno-avx512f" } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } } */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}
--- gcc/testsuite/g++.dg/ext/pr81706.C.jj	2017-10-24 17:24:17.788844444 +0200
+++ gcc/testsuite/g++.dg/ext/pr81706.C	2017-10-24 17:24:17.787844456 +0200
@@ -0,0 +1,32 @@
+// PR libstdc++/81706
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-O3 -mavx2 -mno-avx512f" }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } }
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    p[i] = __builtin_sin (q[i]);
+}


	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)
  2017-10-24 19:36                   ` Jakub Jelinek
@ 2017-10-24 20:26                     ` Jason Merrill
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Merrill @ 2017-10-24 20:26 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Joseph S. Myers, Marek Polacek, Jonathan Wakely,
	gcc-patches List, libstdc++

On Tue, Oct 24, 2017 at 3:36 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Oct 24, 2017 at 01:55:58PM -0400, Jason Merrill wrote:
>> On Tue, Oct 24, 2017 at 11:33 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>> > On Tue, Oct 24, 2017 at 11:06:51AM -0400, Jason Merrill wrote:
>> >> On 09/29/2017 08:32 AM, Jakub Jelinek wrote:
>> >> > +     tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
>> >> > +     if (b)
>> >> > +       duplicate_one_attribute (&DECL_ATTRIBUTES (b),
>> >> > +                                DECL_ATTRIBUTES (newdecl),
>> >> > +                                "omp declare simd");
>> >>
>> >> It occurs to me that we're likely to want to propagate other attributes to
>> >> the builtin, too.  In the testcase, nothrow and leaf also seem appropriate.
>> >> Do we want a broader copy_attributes_to_builtin function, even if it only
>> >> copies this omp attribute for now?
>> >
>> > So like this?
>>
>> I was thinking that the new function would decide which attributes we
>> want to copy over, rather than have a "name" parameter.
>
> Like this then?

Yes, thanks.  This patch is OK.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-10-24 20:26 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-07  9:08 [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706) Jakub Jelinek
2017-08-07 14:54 ` Jason Merrill
2017-08-07 15:28   ` Jakub Jelinek
2017-08-07 20:59     ` Jonathan Wakely
2017-08-07 21:02       ` Jakub Jelinek
2017-08-07 21:58         ` Jonathan Wakely
2017-09-01 11:13     ` Jakub Jelinek
2017-09-09 13:43       ` Jason Merrill
2017-09-12  7:49         ` Jakub Jelinek
2017-09-29 12:32           ` Jakub Jelinek
2017-09-29 20:17             ` Joseph Myers
2017-10-24 15:06             ` Jason Merrill
2017-10-24 15:34               ` Jakub Jelinek
2017-10-24 17:56                 ` Jason Merrill
2017-10-24 19:36                   ` Jakub Jelinek
2017-10-24 20:26                     ` Jason Merrill

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).