public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* inhibit the sincos optimization when the target has sin and cos instructions
@ 2016-05-13 19:19 Cesar Philippidis
  2016-05-13 19:58 ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-13 19:19 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 506 bytes --]

The cse_sincos pass tries to optimize sequences such as

  sin (x);
  cos (x);

into a single call to sincos, or cexpi, when available. However, the
nvptx target has sin and cos instructions, albeit with some loss of
precision (so it's only enabled with -ffast-math). This patch teaches
cse_sincos pass to ignore sin, cos and cexpi instructions when the
target can expand those calls. This yields a 6x speedup in 314.omriq
from spec accel when running on Nvidia accelerators.

Is this OK for trunk?

Cesar

[-- Attachment #2: inhibit-sincos.diff --]
[-- Type: text/x-patch, Size: 1473 bytes --]

2016-05-13  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/
	* tree-ssa-math-opts.c (pass_cse_sincos::execute): Don't optimize
	sin and cos calls when the target has instructions for them.
	
	gcc/testsuite/
	* gcc.target/nvptx/sincos.c: New test.


diff --git a/gcc/testsuite/gcc.target/nvptx/sincos.c b/gcc/testsuite/gcc.target/nvptx/sincos.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 81688cd..38051e1 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1806,6 +1806,11 @@ pass_cse_sincos::execute (function *fun)
 		CASE_CFN_COS:
 		CASE_CFN_SIN:
 		CASE_CFN_CEXPI:
+		  /* Don't modify these calls if they can be translated
+		     directly into hardware instructions.  */
+		  if (replacement_internal_fn (as_a <gcall *> (stmt))
+		      != IFN_LAST)
+		    break;
 		  /* Make sure we have either sincos or cexp.  */
 		  if (!targetm.libc_has_function (function_c99_math_complex)
 		      && !targetm.libc_has_function (function_sincos))

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-13 19:19 inhibit the sincos optimization when the target has sin and cos instructions Cesar Philippidis
@ 2016-05-13 19:58 ` Richard Biener
  2016-05-13 20:13   ` Andrew Pinski
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2016-05-13 19:58 UTC (permalink / raw)
  To: Cesar Philippidis, gcc-patches

On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>The cse_sincos pass tries to optimize sequences such as
>
>  sin (x);
>  cos (x);
>
>into a single call to sincos, or cexpi, when available. However, the
>nvptx target has sin and cos instructions, albeit with some loss of
>precision (so it's only enabled with -ffast-math). This patch teaches
>cse_sincos pass to ignore sin, cos and cexpi instructions when the
>target can expand those calls. This yields a 6x speedup in 314.omriq
>from spec accel when running on Nvidia accelerators.
>
>Is this OK for trunk?

Isn't there an optab for sincos?  ISTR x87 handles this pass just fine and also can do sin and cos.

Richard.

>Cesar


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-13 19:58 ` Richard Biener
@ 2016-05-13 20:13   ` Andrew Pinski
  2016-05-17 21:10     ` Cesar Philippidis
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2016-05-13 20:13 UTC (permalink / raw)
  To: Richard Biener; +Cc: Cesar Philippidis, gcc-patches

On Fri, May 13, 2016 at 12:58 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>The cse_sincos pass tries to optimize sequences such as
>>
>>  sin (x);
>>  cos (x);
>>
>>into a single call to sincos, or cexpi, when available. However, the
>>nvptx target has sin and cos instructions, albeit with some loss of
>>precision (so it's only enabled with -ffast-math). This patch teaches
>>cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>target can expand those calls. This yields a 6x speedup in 314.omriq
>>from spec accel when running on Nvidia accelerators.
>>
>>Is this OK for trunk?
>
> Isn't there an optab for sincos?

This is exactly what I was going to suggest.  This transformation
should be done in the back-end back to sin/cos instructions.

Thanks,
Andrew

> ISTR x87 handles this pass just fine and also can do sin and cos.
>
> Richard.
>
>>Cesar
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-13 20:13   ` Andrew Pinski
@ 2016-05-17 21:10     ` Cesar Philippidis
  2016-05-17 21:23       ` Andrew Pinski
  0 siblings, 1 reply; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-17 21:10 UTC (permalink / raw)
  To: Andrew Pinski, Richard Biener, Nathan Sidwell; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5247 bytes --]

On 05/13/2016 01:13 PM, Andrew Pinski wrote:
> On Fri, May 13, 2016 at 12:58 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>> The cse_sincos pass tries to optimize sequences such as
>>>
>>>  sin (x);
>>>  cos (x);
>>>
>>> into a single call to sincos, or cexpi, when available. However, the
>>> nvptx target has sin and cos instructions, albeit with some loss of
>>> precision (so it's only enabled with -ffast-math). This patch teaches
>>> cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>> target can expand those calls. This yields a 6x speedup in 314.omriq
>> >from spec accel when running on Nvidia accelerators.
>>>
>>> Is this OK for trunk?
>>
>> Isn't there an optab for sincos?
> 
> This is exactly what I was going to suggest.  This transformation
> should be done in the back-end back to sin/cos instructions.

I didn't realize that the 387 has sin, cos and sincos instructions,
so yeah, my original patch is bad.

Nathan, is this patch ok for trunk and gcc-6? It adds a new sincos 
pattern in the nvptx backend. I haven't testing a standalone nvptx 
toolchain prior to this patch, so I'm not sure if my test results 
look sane. I seem to be getting a different set of failures when I
test a clean trunk build multiple times. I attached my results
below for reference.

Cesar

g++.sum
Tests that now fail, but worked before:

nvptx-none-run: g++.dg/abi/param1.C  -std=c++14 execution test

Tests that now work, but didn't before:

nvptx-none-run: g++.dg/opt/pr30590.C  -std=gnu++98 execution test
nvptx-none-run: g++.dg/opt/pr36187.C  -std=gnu++14 execution test

gfortran.sum
Tests that now fail, but worked before:

nvptx-none-run: gfortran.dg/alloc_comp_assign_10.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
nvptx-none-run: gfortran.dg/allocate_with_source_5.f90   -O1  execution test
nvptx-none-run: gfortran.dg/func_assign_3.f90   -O3 -g  execution test
nvptx-none-run: gfortran.dg/inline_sum_3.f90   -O1  execution test
nvptx-none-run: gfortran.dg/inline_sum_3.f90   -O3 -g  execution test
nvptx-none-run: gfortran.dg/internal_pack_15.f90   -O2  execution test
nvptx-none-run: gfortran.dg/internal_pack_8.f90   -Os  execution test
nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90   -O0  execution test
nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
nvptx-none-run: gfortran.dg/intrinsic_pack_5.f90   -O3 -g  execution test
nvptx-none-run: gfortran.dg/intrinsic_product_1.f90   -O1  execution test
nvptx-none-run: gfortran.dg/intrinsic_verify_1.f90   -O3 -g  execution test
nvptx-none-run: gfortran.dg/is_iostat_end_eor_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
nvptx-none-run: gfortran.dg/iso_c_binding_rename_1.f03   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test

Tests that now work, but didn't before:

nvptx-none-run: gfortran.dg/char_pointer_assign.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
nvptx-none-run: gfortran.dg/char_pointer_dummy.f90   -O1  execution test
nvptx-none-run: gfortran.dg/char_pointer_dummy.f90   -Os  execution test
nvptx-none-run: gfortran.dg/char_result_13.f90   -O3 -g  execution test
nvptx-none-run: gfortran.dg/char_result_2.f90   -O1  execution test
nvptx-none-run: gfortran.dg/char_type_len.f90   -Os  execution test
nvptx-none-run: gfortran.dg/character_array_constructor_1.f90   -O0  execution test
nvptx-none-run: gfortran.dg/nested_allocatables_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test

gcc.sum
Tests that now fail, but worked before:

nvptx-none-run: gcc.c-torture/execute/20100316-1.c   -Os  execution test
nvptx-none-run: gcc.c-torture/execute/20100708-1.c   -O1  execution test
nvptx-none-run: gcc.c-torture/execute/20100805-1.c   -O0  execution test
nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -g  execution test

Tests that now work, but didn't before:

nvptx-none-run: gcc.c-torture/execute/20091229-1.c   -O3 -g  execution test
nvptx-none-run: gcc.c-torture/execute/20101013-1.c   -Os  execution test
nvptx-none-run: gcc.c-torture/execute/20101025-1.c   -Os  execution test
nvptx-none-run: gcc.c-torture/execute/20120105-1.c   -O0  execution test
nvptx-none-run: gcc.c-torture/execute/20120111-1.c   -O0  execution test

New tests that PASS:

nvptx-none-run: gcc.target/nvptx/sincos-1.c (test for excess errors)
nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times cos.approx.f32 1
nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times sin.approx.f32 1
nvptx-none-run: gcc.target/nvptx/sincos-2.c (test for excess errors)
nvptx-none-run: gcc.target/nvptx/sincos-2.c execution test


>> ISTR x87 handles this pass just fine and also can do sin and cos.
>>
>> Richard.
>>
>>> Cesar
>>
>>


[-- Attachment #2: nvptx-sincos-20160517.diff --]
[-- Type: text/x-patch, Size: 2398 bytes --]

2016-05-17  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.md (unspec): Add UNSPEC_SINCOS.
	(sincossf3): New pattern.

	gcc/testsuite/
	* gcc.target/nvptx/sincos-1.c: New test.
	* gcc.target/nvptx/sincos-2.c: New test.


diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..03a2f67 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -26,6 +26,7 @@
    UNSPEC_EXP2
    UNSPEC_SIN
    UNSPEC_COS
+   UNSPEC_SINCOS
 
    UNSPEC_FPINT_FLOOR
    UNSPEC_FPINT_BTRUNC
@@ -794,6 +795,20 @@
   ""
   "%.\\tsqrt%#%t0\\t%0, %1;")
 
+(define_expand "sincossf3"
+  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+	           UNSPEC_COS))
+   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+  "flag_unsafe_math_optimizations"
+{
+  emit_insn (gen_sinsf2 (operands[1], operands[2]));
+  emit_insn (gen_cossf2 (operands[0], operands[2]));
+
+  DONE;
+})
+
 (define_insn "sinsf2"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
 	(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-1.c b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-2.c b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
new file mode 100644
index 0000000..b617a7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math" } */
+
+#include <assert.h>
+
+extern float sinf (float);
+extern float cosf (float);
+
+float val = 1.0;
+
+float
+test_sincos (float x, float other_cos)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  assert (c == other_cos);
+  
+  return s + c;
+}
+
+int
+main ()
+{
+  float c = cosf (val);
+  
+  test_sincos (val, c);
+ 
+  return 0;
+}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-17 21:10     ` Cesar Philippidis
@ 2016-05-17 21:23       ` Andrew Pinski
  2016-05-17 21:30         ` Cesar Philippidis
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2016-05-17 21:23 UTC (permalink / raw)
  To: Cesar Philippidis; +Cc: Richard Biener, Nathan Sidwell, gcc-patches

On Tue, May 17, 2016 at 2:10 PM, Cesar Philippidis
<cesar@codesourcery.com> wrote:
> On 05/13/2016 01:13 PM, Andrew Pinski wrote:
>> On Fri, May 13, 2016 at 12:58 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>>> The cse_sincos pass tries to optimize sequences such as
>>>>
>>>>  sin (x);
>>>>  cos (x);
>>>>
>>>> into a single call to sincos, or cexpi, when available. However, the
>>>> nvptx target has sin and cos instructions, albeit with some loss of
>>>> precision (so it's only enabled with -ffast-math). This patch teaches
>>>> cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>>> target can expand those calls. This yields a 6x speedup in 314.omriq
>>> >from spec accel when running on Nvidia accelerators.
>>>>
>>>> Is this OK for trunk?
>>>
>>> Isn't there an optab for sincos?
>>
>> This is exactly what I was going to suggest.  This transformation
>> should be done in the back-end back to sin/cos instructions.
>
> I didn't realize that the 387 has sin, cos and sincos instructions,
> so yeah, my original patch is bad.
>
> Nathan, is this patch ok for trunk and gcc-6? It adds a new sincos
> pattern in the nvptx backend. I haven't testing a standalone nvptx
> toolchain prior to this patch, so I'm not sure if my test results
> look sane. I seem to be getting a different set of failures when I
> test a clean trunk build multiple times. I attached my results
> below for reference.


UNSPEC_SINCOS is unused so why add it?

Thanks,
Andrew Pinski


>
> Cesar
>
> g++.sum
> Tests that now fail, but worked before:
>
> nvptx-none-run: g++.dg/abi/param1.C  -std=c++14 execution test
>
> Tests that now work, but didn't before:
>
> nvptx-none-run: g++.dg/opt/pr30590.C  -std=gnu++98 execution test
> nvptx-none-run: g++.dg/opt/pr36187.C  -std=gnu++14 execution test
>
> gfortran.sum
> Tests that now fail, but worked before:
>
> nvptx-none-run: gfortran.dg/alloc_comp_assign_10.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> nvptx-none-run: gfortran.dg/allocate_with_source_5.f90   -O1  execution test
> nvptx-none-run: gfortran.dg/func_assign_3.f90   -O3 -g  execution test
> nvptx-none-run: gfortran.dg/inline_sum_3.f90   -O1  execution test
> nvptx-none-run: gfortran.dg/inline_sum_3.f90   -O3 -g  execution test
> nvptx-none-run: gfortran.dg/internal_pack_15.f90   -O2  execution test
> nvptx-none-run: gfortran.dg/internal_pack_8.f90   -Os  execution test
> nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90   -O0  execution test
> nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> nvptx-none-run: gfortran.dg/intrinsic_pack_5.f90   -O3 -g  execution test
> nvptx-none-run: gfortran.dg/intrinsic_product_1.f90   -O1  execution test
> nvptx-none-run: gfortran.dg/intrinsic_verify_1.f90   -O3 -g  execution test
> nvptx-none-run: gfortran.dg/is_iostat_end_eor_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> nvptx-none-run: gfortran.dg/iso_c_binding_rename_1.f03   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>
> Tests that now work, but didn't before:
>
> nvptx-none-run: gfortran.dg/char_pointer_assign.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> nvptx-none-run: gfortran.dg/char_pointer_dummy.f90   -O1  execution test
> nvptx-none-run: gfortran.dg/char_pointer_dummy.f90   -Os  execution test
> nvptx-none-run: gfortran.dg/char_result_13.f90   -O3 -g  execution test
> nvptx-none-run: gfortran.dg/char_result_2.f90   -O1  execution test
> nvptx-none-run: gfortran.dg/char_type_len.f90   -Os  execution test
> nvptx-none-run: gfortran.dg/character_array_constructor_1.f90   -O0  execution test
> nvptx-none-run: gfortran.dg/nested_allocatables_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>
> gcc.sum
> Tests that now fail, but worked before:
>
> nvptx-none-run: gcc.c-torture/execute/20100316-1.c   -Os  execution test
> nvptx-none-run: gcc.c-torture/execute/20100708-1.c   -O1  execution test
> nvptx-none-run: gcc.c-torture/execute/20100805-1.c   -O0  execution test
> nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -g  execution test
>
> Tests that now work, but didn't before:
>
> nvptx-none-run: gcc.c-torture/execute/20091229-1.c   -O3 -g  execution test
> nvptx-none-run: gcc.c-torture/execute/20101013-1.c   -Os  execution test
> nvptx-none-run: gcc.c-torture/execute/20101025-1.c   -Os  execution test
> nvptx-none-run: gcc.c-torture/execute/20120105-1.c   -O0  execution test
> nvptx-none-run: gcc.c-torture/execute/20120111-1.c   -O0  execution test
>
> New tests that PASS:
>
> nvptx-none-run: gcc.target/nvptx/sincos-1.c (test for excess errors)
> nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times cos.approx.f32 1
> nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times sin.approx.f32 1
> nvptx-none-run: gcc.target/nvptx/sincos-2.c (test for excess errors)
> nvptx-none-run: gcc.target/nvptx/sincos-2.c execution test
>
>
>>> ISTR x87 handles this pass just fine and also can do sin and cos.
>>>
>>> Richard.
>>>
>>>> Cesar
>>>
>>>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-17 21:23       ` Andrew Pinski
@ 2016-05-17 21:30         ` Cesar Philippidis
  2016-05-18 12:29           ` Nathan Sidwell
  0 siblings, 1 reply; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-17 21:30 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Richard Biener, Nathan Sidwell, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5811 bytes --]

On 05/17/2016 02:22 PM, Andrew Pinski wrote:
> On Tue, May 17, 2016 at 2:10 PM, Cesar Philippidis
> <cesar@codesourcery.com> wrote:
>> On 05/13/2016 01:13 PM, Andrew Pinski wrote:
>>> On Fri, May 13, 2016 at 12:58 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>>>> The cse_sincos pass tries to optimize sequences such as
>>>>>
>>>>>  sin (x);
>>>>>  cos (x);
>>>>>
>>>>> into a single call to sincos, or cexpi, when available. However, the
>>>>> nvptx target has sin and cos instructions, albeit with some loss of
>>>>> precision (so it's only enabled with -ffast-math). This patch teaches
>>>>> cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>>>> target can expand those calls. This yields a 6x speedup in 314.omriq
>>>> >from spec accel when running on Nvidia accelerators.
>>>>>
>>>>> Is this OK for trunk?
>>>>
>>>> Isn't there an optab for sincos?
>>>
>>> This is exactly what I was going to suggest.  This transformation
>>> should be done in the back-end back to sin/cos instructions.
>>
>> I didn't realize that the 387 has sin, cos and sincos instructions,
>> so yeah, my original patch is bad.
>>
>> Nathan, is this patch ok for trunk and gcc-6? It adds a new sincos
>> pattern in the nvptx backend. I haven't testing a standalone nvptx
>> toolchain prior to this patch, so I'm not sure if my test results
>> look sane. I seem to be getting a different set of failures when I
>> test a clean trunk build multiple times. I attached my results
>> below for reference.
> 
> 
> UNSPEC_SINCOS is unused so why add it?

Good eyes, thanks! I thought I had to create a new insn, but I got away
with an expand. I attached the updated patch.

Cesar

>> g++.sum
>> Tests that now fail, but worked before:
>>
>> nvptx-none-run: g++.dg/abi/param1.C  -std=c++14 execution test
>>
>> Tests that now work, but didn't before:
>>
>> nvptx-none-run: g++.dg/opt/pr30590.C  -std=gnu++98 execution test
>> nvptx-none-run: g++.dg/opt/pr36187.C  -std=gnu++14 execution test
>>
>> gfortran.sum
>> Tests that now fail, but worked before:
>>
>> nvptx-none-run: gfortran.dg/alloc_comp_assign_10.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>> nvptx-none-run: gfortran.dg/allocate_with_source_5.f90   -O1  execution test
>> nvptx-none-run: gfortran.dg/func_assign_3.f90   -O3 -g  execution test
>> nvptx-none-run: gfortran.dg/inline_sum_3.f90   -O1  execution test
>> nvptx-none-run: gfortran.dg/inline_sum_3.f90   -O3 -g  execution test
>> nvptx-none-run: gfortran.dg/internal_pack_15.f90   -O2  execution test
>> nvptx-none-run: gfortran.dg/internal_pack_8.f90   -Os  execution test
>> nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90   -O0  execution test
>> nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>> nvptx-none-run: gfortran.dg/intrinsic_pack_5.f90   -O3 -g  execution test
>> nvptx-none-run: gfortran.dg/intrinsic_product_1.f90   -O1  execution test
>> nvptx-none-run: gfortran.dg/intrinsic_verify_1.f90   -O3 -g  execution test
>> nvptx-none-run: gfortran.dg/is_iostat_end_eor_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>> nvptx-none-run: gfortran.dg/iso_c_binding_rename_1.f03   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>>
>> Tests that now work, but didn't before:
>>
>> nvptx-none-run: gfortran.dg/char_pointer_assign.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>> nvptx-none-run: gfortran.dg/char_pointer_dummy.f90   -O1  execution test
>> nvptx-none-run: gfortran.dg/char_pointer_dummy.f90   -Os  execution test
>> nvptx-none-run: gfortran.dg/char_result_13.f90   -O3 -g  execution test
>> nvptx-none-run: gfortran.dg/char_result_2.f90   -O1  execution test
>> nvptx-none-run: gfortran.dg/char_type_len.f90   -Os  execution test
>> nvptx-none-run: gfortran.dg/character_array_constructor_1.f90   -O0  execution test
>> nvptx-none-run: gfortran.dg/nested_allocatables_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>>
>> gcc.sum
>> Tests that now fail, but worked before:
>>
>> nvptx-none-run: gcc.c-torture/execute/20100316-1.c   -Os  execution test
>> nvptx-none-run: gcc.c-torture/execute/20100708-1.c   -O1  execution test
>> nvptx-none-run: gcc.c-torture/execute/20100805-1.c   -O0  execution test
>> nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>> nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -g  execution test
>>
>> Tests that now work, but didn't before:
>>
>> nvptx-none-run: gcc.c-torture/execute/20091229-1.c   -O3 -g  execution test
>> nvptx-none-run: gcc.c-torture/execute/20101013-1.c   -Os  execution test
>> nvptx-none-run: gcc.c-torture/execute/20101025-1.c   -Os  execution test
>> nvptx-none-run: gcc.c-torture/execute/20120105-1.c   -O0  execution test
>> nvptx-none-run: gcc.c-torture/execute/20120111-1.c   -O0  execution test
>>
>> New tests that PASS:
>>
>> nvptx-none-run: gcc.target/nvptx/sincos-1.c (test for excess errors)
>> nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times cos.approx.f32 1
>> nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times sin.approx.f32 1
>> nvptx-none-run: gcc.target/nvptx/sincos-2.c (test for excess errors)
>> nvptx-none-run: gcc.target/nvptx/sincos-2.c execution test
>>
>>
>>>> ISTR x87 handles this pass just fine and also can do sin and cos.
>>>>
>>>> Richard.
>>>>
>>>>> Cesar
>>>>
>>>>
>>


[-- Attachment #2: nvptx-sincos-20160517b.diff --]
[-- Type: text/x-patch, Size: 2237 bytes --]

2016-05-17  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.md (sincossf3): New pattern.

	gcc/testsuite/
	* gcc.target/nvptx/sincos-1.c: New test.
	* gcc.target/nvptx/sincos-2.c: New test.


diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..29642ec 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -794,6 +794,20 @@
   ""
   "%.\\tsqrt%#%t0\\t%0, %1;")
 
+(define_expand "sincossf3"
+  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+	           UNSPEC_COS))
+   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+  "flag_unsafe_math_optimizations"
+{
+  emit_insn (gen_sinsf2 (operands[1], operands[2]));
+  emit_insn (gen_cossf2 (operands[0], operands[2]));
+
+  DONE;
+})
+
 (define_insn "sinsf2"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
 	(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-1.c b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-2.c b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
new file mode 100644
index 0000000..b617a7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math" } */
+
+#include <assert.h>
+
+extern float sinf (float);
+extern float cosf (float);
+
+float val = 1.0;
+
+float
+test_sincos (float x, float other_cos)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  assert (c == other_cos);
+  
+  return s + c;
+}
+
+int
+main ()
+{
+  float c = cosf (val);
+  
+  test_sincos (val, c);
+ 
+  return 0;
+}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-17 21:30         ` Cesar Philippidis
@ 2016-05-18 12:29           ` Nathan Sidwell
  2016-05-19  3:43             ` Cesar Philippidis
  0 siblings, 1 reply; 12+ messages in thread
From: Nathan Sidwell @ 2016-05-18 12:29 UTC (permalink / raw)
  To: Cesar Philippidis, Andrew Pinski; +Cc: Richard Biener, gcc-patches

On 05/17/16 17:30, Cesar Philippidis wrote:
> On 05/17/2016 02:22 PM, Andrew Pinski wrote:

> Good eyes, thanks! I thought I had to create a new insn, but I got away
> with an expand. I attached the updated patch.
>
> Cesar

>>> gcc.sum
>>> Tests that now fail, but worked before:
>>>
>>> nvptx-none-run: gcc.c-torture/execute/20100316-1.c   -Os  execution test
>>> nvptx-none-run: gcc.c-torture/execute/20100708-1.c   -O1  execution test
>>> nvptx-none-run: gcc.c-torture/execute/20100805-1.c   -O0  execution test
>>> nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>>> nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -g  execution test
>>>

Please determine why these now fail.

> +(define_expand "sincossf3"
> +  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
> +	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
> +	           UNSPEC_COS))
> +   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
> +	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
> +  "flag_unsafe_math_optimizations"
> +{
> +  emit_insn (gen_sinsf2 (operands[1], operands[2]));
> +  emit_insn (gen_cossf2 (operands[0], operands[2]));
> +
> +  DONE;
> +})

Why the emit_insn code?  that seems to be replicating the RTL representation -- 
you're saying the same thing twice.

Doesn't operands[2] need (conditionally) copying to a new register -- what if it 
aliases operands[1]?

> +++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
> @@ -0,0 +1,30 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math" } */
> +

What is this test trying to test?  I'm puzzled by it.  (btw, don't use assert, 
either abort, exit(1) or return from main.)

nathan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-18 12:29           ` Nathan Sidwell
@ 2016-05-19  3:43             ` Cesar Philippidis
  2016-05-19 11:29               ` Alexander Monakov
  2016-05-19 23:12               ` Nathan Sidwell
  0 siblings, 2 replies; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-19  3:43 UTC (permalink / raw)
  To: Nathan Sidwell, Andrew Pinski; +Cc: Richard Biener, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2169 bytes --]

On 05/18/2016 05:29 AM, Nathan Sidwell wrote:
> On 05/17/16 17:30, Cesar Philippidis wrote:
>> On 05/17/2016 02:22 PM, Andrew Pinski wrote:

>>>> gcc.sum
>>>> Tests that now fail, but worked before:
>>>>
>>>> nvptx-none-run: gcc.c-torture/execute/20100316-1.c   -Os  execution
>>>> test
>>>> nvptx-none-run: gcc.c-torture/execute/20100708-1.c   -O1  execution
>>>> test
>>>> nvptx-none-run: gcc.c-torture/execute/20100805-1.c   -O0  execution
>>>> test
>>>> nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer
>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
>>>> nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -g  execution test
>>>>
> 
> Please determine why these now fail.

Those were failing intermittently, at least on my desktop. I'll look
into that it next.

>> +(define_expand "sincossf3"
>> +  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
>> +    (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
>> +               UNSPEC_COS))
>> +   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
>> +    (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
>> +  "flag_unsafe_math_optimizations"
>> +{
>> +  emit_insn (gen_sinsf2 (operands[1], operands[2]));
>> +  emit_insn (gen_cossf2 (operands[0], operands[2]));
>> +
>> +  DONE;
>> +})
> 
> Why the emit_insn code?  that seems to be replicating the RTL
> representation -- you're saying the same thing twice.
> 
> Doesn't operands[2] need (conditionally) copying to a new register --
> what if it aliases operands[1]?

This patch does that now.

>> +++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
>> @@ -0,0 +1,30 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +
> 
> What is this test trying to test?  I'm puzzled by it.  (btw, don't use
> assert, either abort, exit(1) or return from main.)

My intent was to verify that I got the sin and cos arguments right,
i.e., make sure that this sincos expansion didn't mix up sin(x) with
cos(x). I guess I can create a test that uses vprintf and scans
dg-output for the proper results. But in this patch I just omitted that
test case altogether.

Is this patch ok for trunk?

Cesar


[-- Attachment #2: nvptx-sincos-20160518.diff --]
[-- Type: text/x-patch, Size: 1724 bytes --]

2016-05-18  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.md (sincossf3): New pattern.

	gcc/testsuite/
	* gcc.target/nvptx/sincos.c: New test.


diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..69bbb22 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -794,6 +794,24 @@
   ""
   "%.\\tsqrt%#%t0\\t%0, %1;")
 
+(define_expand "sincossf3"
+  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+	           UNSPEC_COS))
+   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+  "flag_unsafe_math_optimizations"
+{
+  if (REGNO (operands[0]) == REGNO (operands[2]))
+    {
+      rtx tmp = gen_reg_rtx (GET_MODE (operands[2]));
+      emit_insn (gen_rtx_SET (tmp, operands[2]));
+      emit_insn (gen_sinsf2 (operands[1], tmp));
+      emit_insn (gen_cossf2 (operands[0], tmp));
+      DONE;
+    }
+})
+
 (define_insn "sinsf2"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
 	(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos.c b/gcc/testsuite/gcc.target/nvptx/sincos.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-19  3:43             ` Cesar Philippidis
@ 2016-05-19 11:29               ` Alexander Monakov
  2016-05-19 18:42                 ` Cesar Philippidis
  2016-05-19 23:12               ` Nathan Sidwell
  1 sibling, 1 reply; 12+ messages in thread
From: Alexander Monakov @ 2016-05-19 11:29 UTC (permalink / raw)
  To: Cesar Philippidis
  Cc: Nathan Sidwell, Andrew Pinski, Richard Biener, gcc-patches

On Wed, 18 May 2016, Cesar Philippidis wrote:
> >> +(define_expand "sincossf3"
> >> +  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
> >> +    (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
> >> +               UNSPEC_COS))
> >> +   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
> >> +    (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
> >> +  "flag_unsafe_math_optimizations"
> >> +{
> >> +  emit_insn (gen_sinsf2 (operands[1], operands[2]));
> >> +  emit_insn (gen_cossf2 (operands[0], operands[2]));
> >> +
> >> +  DONE;
> >> +})
> > 
> > Why the emit_insn code?  that seems to be replicating the RTL
> > representation -- you're saying the same thing twice.
> > 
> > Doesn't operands[2] need (conditionally) copying to a new register --
> > what if it aliases operands[1]?
> 
> This patch does that now.

Note that the documentation suggests using 'make_safe_from' to concisely
express conflict resolution:

> diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
> index 33a4862..69bbb22 100644
> --- a/gcc/config/nvptx/nvptx.md
> +++ b/gcc/config/nvptx/nvptx.md
> @@ -794,6 +794,24 @@
>    ""
>    "%.\\tsqrt%#%t0\\t%0, %1;")
>  
> +(define_expand "sincossf3"
> +  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
> +	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
> +	           UNSPEC_COS))
> +   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
> +	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
> +  "flag_unsafe_math_optimizations"

... here instead of special-casing the conflict case in curly braces you can
just write:

    "operands[2] = make_safe_from (operands[2], operands[0]);"

> +{
> +  if (REGNO (operands[0]) == REGNO (operands[2]))
> +    {
> +      rtx tmp = gen_reg_rtx (GET_MODE (operands[2]));
> +      emit_insn (gen_rtx_SET (tmp, operands[2]));
> +      emit_insn (gen_sinsf2 (operands[1], tmp));
> +      emit_insn (gen_cossf2 (operands[0], tmp));
> +      DONE;
> +    }
> +})

Alexander

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-19 11:29               ` Alexander Monakov
@ 2016-05-19 18:42                 ` Cesar Philippidis
  2016-05-19 23:30                   ` Nathan Sidwell
  0 siblings, 1 reply; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-19 18:42 UTC (permalink / raw)
  To: Alexander Monakov
  Cc: Nathan Sidwell, Andrew Pinski, Richard Biener, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1316 bytes --]

On 05/19/2016 04:29 AM, Alexander Monakov wrote:
> On Wed, 18 May 2016, Cesar Philippidis wrote:

> Note that the documentation suggests using 'make_safe_from' to concisely
> express conflict resolution:
> 
>> diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
>> index 33a4862..69bbb22 100644
>> --- a/gcc/config/nvptx/nvptx.md
>> +++ b/gcc/config/nvptx/nvptx.md
>> @@ -794,6 +794,24 @@
>>    ""
>>    "%.\\tsqrt%#%t0\\t%0, %1;")
>>  
>> +(define_expand "sincossf3"
>> +  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
>> +	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
>> +	           UNSPEC_COS))
>> +   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
>> +	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
>> +  "flag_unsafe_math_optimizations"
> 
> ... here instead of special-casing the conflict case in curly braces you can
> just write:
> 
>     "operands[2] = make_safe_from (operands[2], operands[0]);"
> 
>> +{
>> +  if (REGNO (operands[0]) == REGNO (operands[2]))
>> +    {
>> +      rtx tmp = gen_reg_rtx (GET_MODE (operands[2]));
>> +      emit_insn (gen_rtx_SET (tmp, operands[2]));
>> +      emit_insn (gen_sinsf2 (operands[1], tmp));
>> +      emit_insn (gen_cossf2 (operands[0], tmp));
>> +      DONE;
>> +    }
>> +})

Done. Is this ok for trunk?

Cesar


[-- Attachment #2: nvptx-sincos-20160519.diff --]
[-- Type: text/x-patch, Size: 1498 bytes --]

2016-05-19  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.md (sincossf3): New pattern.

	gcc/testsuite/
	* gcc.target/nvptx/sincos.c: New test.


diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..1dd256d 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -794,6 +794,16 @@
   ""
   "%.\\tsqrt%#%t0\\t%0, %1;")
 
+(define_expand "sincossf3"
+  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+	           UNSPEC_COS))
+   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+  "flag_unsafe_math_optimizations"
+  "operands[2] = make_safe_from (operands[2], operands[0]);"
+)
+
 (define_insn "sinsf2"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
 	(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos.c b/gcc/testsuite/gcc.target/nvptx/sincos.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-19  3:43             ` Cesar Philippidis
  2016-05-19 11:29               ` Alexander Monakov
@ 2016-05-19 23:12               ` Nathan Sidwell
  1 sibling, 0 replies; 12+ messages in thread
From: Nathan Sidwell @ 2016-05-19 23:12 UTC (permalink / raw)
  To: Cesar Philippidis, Andrew Pinski; +Cc: Richard Biener, gcc-patches

On 05/18/16 23:42, Cesar Philippidis wrote:

>>> +(define_expand "sincossf3"
>>> +  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
>>> +    (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
>>> +               UNSPEC_COS))
>>> +   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
>>> +    (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
>>> +  "flag_unsafe_math_optimizations"
>>> +{
>>> +  emit_insn (gen_sinsf2 (operands[1], operands[2]));
>>> +  emit_insn (gen_cossf2 (operands[0], operands[2]));
>>> +
>>> +  DONE;
>>> +})
>>
>> Why the emit_insn code?  that seems to be replicating the RTL
>> representation -- you're saying the same thing twice.

You've not answered this.  I always find it confusing when insn expansions say 
the same thing in the RTL represntation as the C fragment is generating.

> My intent was to verify that I got the sin and cos arguments right,
> i.e., make sure that this sincos expansion didn't mix up sin(x) with
> cos(x). I guess I can create a test that uses vprintf and scans

That's a good idea, but I think you need  a much clearer testcase. It'd be 
better to check both the sin and cos outputs (and, that's going to be 
'interesting' making sure only one path gets to be seen by the sincos optimizer 
-- but not difficult.

nathan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: inhibit the sincos optimization when the target has sin and cos instructions
  2016-05-19 18:42                 ` Cesar Philippidis
@ 2016-05-19 23:30                   ` Nathan Sidwell
  0 siblings, 0 replies; 12+ messages in thread
From: Nathan Sidwell @ 2016-05-19 23:30 UTC (permalink / raw)
  To: Cesar Philippidis, Alexander Monakov
  Cc: Andrew Pinski, Richard Biener, gcc-patches

On 05/19/16 14:42, Cesar Philippidis wrote:

> +  "operands[2] = make_safe_from (operands[2], operands[0]);"


Please use { ... } rather than "" for readability. Ok  with that change.

nathan

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-05-19 23:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-13 19:19 inhibit the sincos optimization when the target has sin and cos instructions Cesar Philippidis
2016-05-13 19:58 ` Richard Biener
2016-05-13 20:13   ` Andrew Pinski
2016-05-17 21:10     ` Cesar Philippidis
2016-05-17 21:23       ` Andrew Pinski
2016-05-17 21:30         ` Cesar Philippidis
2016-05-18 12:29           ` Nathan Sidwell
2016-05-19  3:43             ` Cesar Philippidis
2016-05-19 11:29               ` Alexander Monakov
2016-05-19 18:42                 ` Cesar Philippidis
2016-05-19 23:30                   ` Nathan Sidwell
2016-05-19 23:12               ` Nathan Sidwell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).