* inhibit the sincos optimization when the target has sin and cos instructions
@ 2016-05-13 19:19 Cesar Philippidis
2016-05-13 19:58 ` Richard Biener
0 siblings, 1 reply; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-13 19:19 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 506 bytes --]
The cse_sincos pass tries to optimize sequences such as
sin (x);
cos (x);
into a single call to sincos, or cexpi, when available. However, the
nvptx target has sin and cos instructions, albeit with some loss of
precision (so it's only enabled with -ffast-math). This patch teaches
cse_sincos pass to ignore sin, cos and cexpi instructions when the
target can expand those calls. This yields a 6x speedup in 314.omriq
from spec accel when running on Nvidia accelerators.
Is this OK for trunk?
Cesar
[-- Attachment #2: inhibit-sincos.diff --]
[-- Type: text/x-patch, Size: 1473 bytes --]
2016-05-13 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* tree-ssa-math-opts.c (pass_cse_sincos::execute): Don't optimize
sin and cos calls when the target has instructions for them.
gcc/testsuite/
* gcc.target/nvptx/sincos.c: New test.
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos.c b/gcc/testsuite/gcc.target/nvptx/sincos.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+ float s = sinf (x);
+ float c = cosf (x);
+
+ return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 81688cd..38051e1 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1806,6 +1806,11 @@ pass_cse_sincos::execute (function *fun)
CASE_CFN_COS:
CASE_CFN_SIN:
CASE_CFN_CEXPI:
+ /* Don't modify these calls if they can be translated
+ directly into hardware instructions. */
+ if (replacement_internal_fn (as_a <gcall *> (stmt))
+ != IFN_LAST)
+ break;
/* Make sure we have either sincos or cexp. */
if (!targetm.libc_has_function (function_c99_math_complex)
&& !targetm.libc_has_function (function_sincos))
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-13 19:19 inhibit the sincos optimization when the target has sin and cos instructions Cesar Philippidis
@ 2016-05-13 19:58 ` Richard Biener
2016-05-13 20:13 ` Andrew Pinski
0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2016-05-13 19:58 UTC (permalink / raw)
To: Cesar Philippidis, gcc-patches
On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>The cse_sincos pass tries to optimize sequences such as
>
> sin (x);
> cos (x);
>
>into a single call to sincos, or cexpi, when available. However, the
>nvptx target has sin and cos instructions, albeit with some loss of
>precision (so it's only enabled with -ffast-math). This patch teaches
>cse_sincos pass to ignore sin, cos and cexpi instructions when the
>target can expand those calls. This yields a 6x speedup in 314.omriq
>from spec accel when running on Nvidia accelerators.
>
>Is this OK for trunk?
Isn't there an optab for sincos? ISTR x87 handles this pass just fine and also can do sin and cos.
Richard.
>Cesar
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-13 19:58 ` Richard Biener
@ 2016-05-13 20:13 ` Andrew Pinski
2016-05-17 21:10 ` Cesar Philippidis
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2016-05-13 20:13 UTC (permalink / raw)
To: Richard Biener; +Cc: Cesar Philippidis, gcc-patches
On Fri, May 13, 2016 at 12:58 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>The cse_sincos pass tries to optimize sequences such as
>>
>> sin (x);
>> cos (x);
>>
>>into a single call to sincos, or cexpi, when available. However, the
>>nvptx target has sin and cos instructions, albeit with some loss of
>>precision (so it's only enabled with -ffast-math). This patch teaches
>>cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>target can expand those calls. This yields a 6x speedup in 314.omriq
>>from spec accel when running on Nvidia accelerators.
>>
>>Is this OK for trunk?
>
> Isn't there an optab for sincos?
This is exactly what I was going to suggest. This transformation
should be done in the back-end back to sin/cos instructions.
Thanks,
Andrew
> ISTR x87 handles this pass just fine and also can do sin and cos.
>
> Richard.
>
>>Cesar
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-13 20:13 ` Andrew Pinski
@ 2016-05-17 21:10 ` Cesar Philippidis
2016-05-17 21:23 ` Andrew Pinski
0 siblings, 1 reply; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-17 21:10 UTC (permalink / raw)
To: Andrew Pinski, Richard Biener, Nathan Sidwell; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 5247 bytes --]
On 05/13/2016 01:13 PM, Andrew Pinski wrote:
> On Fri, May 13, 2016 at 12:58 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>> The cse_sincos pass tries to optimize sequences such as
>>>
>>> sin (x);
>>> cos (x);
>>>
>>> into a single call to sincos, or cexpi, when available. However, the
>>> nvptx target has sin and cos instructions, albeit with some loss of
>>> precision (so it's only enabled with -ffast-math). This patch teaches
>>> cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>> target can expand those calls. This yields a 6x speedup in 314.omriq
>> >from spec accel when running on Nvidia accelerators.
>>>
>>> Is this OK for trunk?
>>
>> Isn't there an optab for sincos?
>
> This is exactly what I was going to suggest. This transformation
> should be done in the back-end back to sin/cos instructions.
I didn't realize that the 387 has sin, cos and sincos instructions,
so yeah, my original patch is bad.
Nathan, is this patch ok for trunk and gcc-6? It adds a new sincos
pattern in the nvptx backend. I haven't testing a standalone nvptx
toolchain prior to this patch, so I'm not sure if my test results
look sane. I seem to be getting a different set of failures when I
test a clean trunk build multiple times. I attached my results
below for reference.
Cesar
g++.sum
Tests that now fail, but worked before:
nvptx-none-run: g++.dg/abi/param1.C -std=c++14 execution test
Tests that now work, but didn't before:
nvptx-none-run: g++.dg/opt/pr30590.C -std=gnu++98 execution test
nvptx-none-run: g++.dg/opt/pr36187.C -std=gnu++14 execution test
gfortran.sum
Tests that now fail, but worked before:
nvptx-none-run: gfortran.dg/alloc_comp_assign_10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gfortran.dg/allocate_with_source_5.f90 -O1 execution test
nvptx-none-run: gfortran.dg/func_assign_3.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/inline_sum_3.f90 -O1 execution test
nvptx-none-run: gfortran.dg/inline_sum_3.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/internal_pack_15.f90 -O2 execution test
nvptx-none-run: gfortran.dg/internal_pack_8.f90 -Os execution test
nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90 -O0 execution test
nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gfortran.dg/intrinsic_pack_5.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/intrinsic_product_1.f90 -O1 execution test
nvptx-none-run: gfortran.dg/intrinsic_verify_1.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/is_iostat_end_eor_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gfortran.dg/iso_c_binding_rename_1.f03 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
Tests that now work, but didn't before:
nvptx-none-run: gfortran.dg/char_pointer_assign.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gfortran.dg/char_pointer_dummy.f90 -O1 execution test
nvptx-none-run: gfortran.dg/char_pointer_dummy.f90 -Os execution test
nvptx-none-run: gfortran.dg/char_result_13.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/char_result_2.f90 -O1 execution test
nvptx-none-run: gfortran.dg/char_type_len.f90 -Os execution test
nvptx-none-run: gfortran.dg/character_array_constructor_1.f90 -O0 execution test
nvptx-none-run: gfortran.dg/nested_allocatables_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
gcc.sum
Tests that now fail, but worked before:
nvptx-none-run: gcc.c-torture/execute/20100316-1.c -Os execution test
nvptx-none-run: gcc.c-torture/execute/20100708-1.c -O1 execution test
nvptx-none-run: gcc.c-torture/execute/20100805-1.c -O0 execution test
nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -g execution test
Tests that now work, but didn't before:
nvptx-none-run: gcc.c-torture/execute/20091229-1.c -O3 -g execution test
nvptx-none-run: gcc.c-torture/execute/20101013-1.c -Os execution test
nvptx-none-run: gcc.c-torture/execute/20101025-1.c -Os execution test
nvptx-none-run: gcc.c-torture/execute/20120105-1.c -O0 execution test
nvptx-none-run: gcc.c-torture/execute/20120111-1.c -O0 execution test
New tests that PASS:
nvptx-none-run: gcc.target/nvptx/sincos-1.c (test for excess errors)
nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times cos.approx.f32 1
nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times sin.approx.f32 1
nvptx-none-run: gcc.target/nvptx/sincos-2.c (test for excess errors)
nvptx-none-run: gcc.target/nvptx/sincos-2.c execution test
>> ISTR x87 handles this pass just fine and also can do sin and cos.
>>
>> Richard.
>>
>>> Cesar
>>
>>
[-- Attachment #2: nvptx-sincos-20160517.diff --]
[-- Type: text/x-patch, Size: 2398 bytes --]
2016-05-17 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* config/nvptx/nvptx.md (unspec): Add UNSPEC_SINCOS.
(sincossf3): New pattern.
gcc/testsuite/
* gcc.target/nvptx/sincos-1.c: New test.
* gcc.target/nvptx/sincos-2.c: New test.
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..03a2f67 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -26,6 +26,7 @@
UNSPEC_EXP2
UNSPEC_SIN
UNSPEC_COS
+ UNSPEC_SINCOS
UNSPEC_FPINT_FLOOR
UNSPEC_FPINT_BTRUNC
@@ -794,6 +795,20 @@
""
"%.\\tsqrt%#%t0\\t%0, %1;")
+(define_expand "sincossf3"
+ [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+ (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+ UNSPEC_COS))
+ (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+ (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+ "flag_unsafe_math_optimizations"
+{
+ emit_insn (gen_sinsf2 (operands[1], operands[2]));
+ emit_insn (gen_cossf2 (operands[0], operands[2]));
+
+ DONE;
+})
+
(define_insn "sinsf2"
[(set (match_operand:SF 0 "nvptx_register_operand" "=R")
(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-1.c b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+ float s = sinf (x);
+ float c = cosf (x);
+
+ return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-2.c b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
new file mode 100644
index 0000000..b617a7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math" } */
+
+#include <assert.h>
+
+extern float sinf (float);
+extern float cosf (float);
+
+float val = 1.0;
+
+float
+test_sincos (float x, float other_cos)
+{
+ float s = sinf (x);
+ float c = cosf (x);
+
+ assert (c == other_cos);
+
+ return s + c;
+}
+
+int
+main ()
+{
+ float c = cosf (val);
+
+ test_sincos (val, c);
+
+ return 0;
+}
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-17 21:10 ` Cesar Philippidis
@ 2016-05-17 21:23 ` Andrew Pinski
2016-05-17 21:30 ` Cesar Philippidis
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2016-05-17 21:23 UTC (permalink / raw)
To: Cesar Philippidis; +Cc: Richard Biener, Nathan Sidwell, gcc-patches
On Tue, May 17, 2016 at 2:10 PM, Cesar Philippidis
<cesar@codesourcery.com> wrote:
> On 05/13/2016 01:13 PM, Andrew Pinski wrote:
>> On Fri, May 13, 2016 at 12:58 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>>> The cse_sincos pass tries to optimize sequences such as
>>>>
>>>> sin (x);
>>>> cos (x);
>>>>
>>>> into a single call to sincos, or cexpi, when available. However, the
>>>> nvptx target has sin and cos instructions, albeit with some loss of
>>>> precision (so it's only enabled with -ffast-math). This patch teaches
>>>> cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>>> target can expand those calls. This yields a 6x speedup in 314.omriq
>>> >from spec accel when running on Nvidia accelerators.
>>>>
>>>> Is this OK for trunk?
>>>
>>> Isn't there an optab for sincos?
>>
>> This is exactly what I was going to suggest. This transformation
>> should be done in the back-end back to sin/cos instructions.
>
> I didn't realize that the 387 has sin, cos and sincos instructions,
> so yeah, my original patch is bad.
>
> Nathan, is this patch ok for trunk and gcc-6? It adds a new sincos
> pattern in the nvptx backend. I haven't testing a standalone nvptx
> toolchain prior to this patch, so I'm not sure if my test results
> look sane. I seem to be getting a different set of failures when I
> test a clean trunk build multiple times. I attached my results
> below for reference.
UNSPEC_SINCOS is unused so why add it?
Thanks,
Andrew Pinski
>
> Cesar
>
> g++.sum
> Tests that now fail, but worked before:
>
> nvptx-none-run: g++.dg/abi/param1.C -std=c++14 execution test
>
> Tests that now work, but didn't before:
>
> nvptx-none-run: g++.dg/opt/pr30590.C -std=gnu++98 execution test
> nvptx-none-run: g++.dg/opt/pr36187.C -std=gnu++14 execution test
>
> gfortran.sum
> Tests that now fail, but worked before:
>
> nvptx-none-run: gfortran.dg/alloc_comp_assign_10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
> nvptx-none-run: gfortran.dg/allocate_with_source_5.f90 -O1 execution test
> nvptx-none-run: gfortran.dg/func_assign_3.f90 -O3 -g execution test
> nvptx-none-run: gfortran.dg/inline_sum_3.f90 -O1 execution test
> nvptx-none-run: gfortran.dg/inline_sum_3.f90 -O3 -g execution test
> nvptx-none-run: gfortran.dg/internal_pack_15.f90 -O2 execution test
> nvptx-none-run: gfortran.dg/internal_pack_8.f90 -Os execution test
> nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90 -O0 execution test
> nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
> nvptx-none-run: gfortran.dg/intrinsic_pack_5.f90 -O3 -g execution test
> nvptx-none-run: gfortran.dg/intrinsic_product_1.f90 -O1 execution test
> nvptx-none-run: gfortran.dg/intrinsic_verify_1.f90 -O3 -g execution test
> nvptx-none-run: gfortran.dg/is_iostat_end_eor_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
> nvptx-none-run: gfortran.dg/iso_c_binding_rename_1.f03 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>
> Tests that now work, but didn't before:
>
> nvptx-none-run: gfortran.dg/char_pointer_assign.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
> nvptx-none-run: gfortran.dg/char_pointer_dummy.f90 -O1 execution test
> nvptx-none-run: gfortran.dg/char_pointer_dummy.f90 -Os execution test
> nvptx-none-run: gfortran.dg/char_result_13.f90 -O3 -g execution test
> nvptx-none-run: gfortran.dg/char_result_2.f90 -O1 execution test
> nvptx-none-run: gfortran.dg/char_type_len.f90 -Os execution test
> nvptx-none-run: gfortran.dg/character_array_constructor_1.f90 -O0 execution test
> nvptx-none-run: gfortran.dg/nested_allocatables_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>
> gcc.sum
> Tests that now fail, but worked before:
>
> nvptx-none-run: gcc.c-torture/execute/20100316-1.c -Os execution test
> nvptx-none-run: gcc.c-torture/execute/20100708-1.c -O1 execution test
> nvptx-none-run: gcc.c-torture/execute/20100805-1.c -O0 execution test
> nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
> nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -g execution test
>
> Tests that now work, but didn't before:
>
> nvptx-none-run: gcc.c-torture/execute/20091229-1.c -O3 -g execution test
> nvptx-none-run: gcc.c-torture/execute/20101013-1.c -Os execution test
> nvptx-none-run: gcc.c-torture/execute/20101025-1.c -Os execution test
> nvptx-none-run: gcc.c-torture/execute/20120105-1.c -O0 execution test
> nvptx-none-run: gcc.c-torture/execute/20120111-1.c -O0 execution test
>
> New tests that PASS:
>
> nvptx-none-run: gcc.target/nvptx/sincos-1.c (test for excess errors)
> nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times cos.approx.f32 1
> nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times sin.approx.f32 1
> nvptx-none-run: gcc.target/nvptx/sincos-2.c (test for excess errors)
> nvptx-none-run: gcc.target/nvptx/sincos-2.c execution test
>
>
>>> ISTR x87 handles this pass just fine and also can do sin and cos.
>>>
>>> Richard.
>>>
>>>> Cesar
>>>
>>>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-17 21:23 ` Andrew Pinski
@ 2016-05-17 21:30 ` Cesar Philippidis
2016-05-18 12:29 ` Nathan Sidwell
0 siblings, 1 reply; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-17 21:30 UTC (permalink / raw)
To: Andrew Pinski; +Cc: Richard Biener, Nathan Sidwell, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 5811 bytes --]
On 05/17/2016 02:22 PM, Andrew Pinski wrote:
> On Tue, May 17, 2016 at 2:10 PM, Cesar Philippidis
> <cesar@codesourcery.com> wrote:
>> On 05/13/2016 01:13 PM, Andrew Pinski wrote:
>>> On Fri, May 13, 2016 at 12:58 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>>>> The cse_sincos pass tries to optimize sequences such as
>>>>>
>>>>> sin (x);
>>>>> cos (x);
>>>>>
>>>>> into a single call to sincos, or cexpi, when available. However, the
>>>>> nvptx target has sin and cos instructions, albeit with some loss of
>>>>> precision (so it's only enabled with -ffast-math). This patch teaches
>>>>> cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>>>> target can expand those calls. This yields a 6x speedup in 314.omriq
>>>> >from spec accel when running on Nvidia accelerators.
>>>>>
>>>>> Is this OK for trunk?
>>>>
>>>> Isn't there an optab for sincos?
>>>
>>> This is exactly what I was going to suggest. This transformation
>>> should be done in the back-end back to sin/cos instructions.
>>
>> I didn't realize that the 387 has sin, cos and sincos instructions,
>> so yeah, my original patch is bad.
>>
>> Nathan, is this patch ok for trunk and gcc-6? It adds a new sincos
>> pattern in the nvptx backend. I haven't testing a standalone nvptx
>> toolchain prior to this patch, so I'm not sure if my test results
>> look sane. I seem to be getting a different set of failures when I
>> test a clean trunk build multiple times. I attached my results
>> below for reference.
>
>
> UNSPEC_SINCOS is unused so why add it?
Good eyes, thanks! I thought I had to create a new insn, but I got away
with an expand. I attached the updated patch.
Cesar
>> g++.sum
>> Tests that now fail, but worked before:
>>
>> nvptx-none-run: g++.dg/abi/param1.C -std=c++14 execution test
>>
>> Tests that now work, but didn't before:
>>
>> nvptx-none-run: g++.dg/opt/pr30590.C -std=gnu++98 execution test
>> nvptx-none-run: g++.dg/opt/pr36187.C -std=gnu++14 execution test
>>
>> gfortran.sum
>> Tests that now fail, but worked before:
>>
>> nvptx-none-run: gfortran.dg/alloc_comp_assign_10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>> nvptx-none-run: gfortran.dg/allocate_with_source_5.f90 -O1 execution test
>> nvptx-none-run: gfortran.dg/func_assign_3.f90 -O3 -g execution test
>> nvptx-none-run: gfortran.dg/inline_sum_3.f90 -O1 execution test
>> nvptx-none-run: gfortran.dg/inline_sum_3.f90 -O3 -g execution test
>> nvptx-none-run: gfortran.dg/internal_pack_15.f90 -O2 execution test
>> nvptx-none-run: gfortran.dg/internal_pack_8.f90 -Os execution test
>> nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90 -O0 execution test
>> nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>> nvptx-none-run: gfortran.dg/intrinsic_pack_5.f90 -O3 -g execution test
>> nvptx-none-run: gfortran.dg/intrinsic_product_1.f90 -O1 execution test
>> nvptx-none-run: gfortran.dg/intrinsic_verify_1.f90 -O3 -g execution test
>> nvptx-none-run: gfortran.dg/is_iostat_end_eor_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>> nvptx-none-run: gfortran.dg/iso_c_binding_rename_1.f03 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>>
>> Tests that now work, but didn't before:
>>
>> nvptx-none-run: gfortran.dg/char_pointer_assign.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>> nvptx-none-run: gfortran.dg/char_pointer_dummy.f90 -O1 execution test
>> nvptx-none-run: gfortran.dg/char_pointer_dummy.f90 -Os execution test
>> nvptx-none-run: gfortran.dg/char_result_13.f90 -O3 -g execution test
>> nvptx-none-run: gfortran.dg/char_result_2.f90 -O1 execution test
>> nvptx-none-run: gfortran.dg/char_type_len.f90 -Os execution test
>> nvptx-none-run: gfortran.dg/character_array_constructor_1.f90 -O0 execution test
>> nvptx-none-run: gfortran.dg/nested_allocatables_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>>
>> gcc.sum
>> Tests that now fail, but worked before:
>>
>> nvptx-none-run: gcc.c-torture/execute/20100316-1.c -Os execution test
>> nvptx-none-run: gcc.c-torture/execute/20100708-1.c -O1 execution test
>> nvptx-none-run: gcc.c-torture/execute/20100805-1.c -O0 execution test
>> nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>> nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -g execution test
>>
>> Tests that now work, but didn't before:
>>
>> nvptx-none-run: gcc.c-torture/execute/20091229-1.c -O3 -g execution test
>> nvptx-none-run: gcc.c-torture/execute/20101013-1.c -Os execution test
>> nvptx-none-run: gcc.c-torture/execute/20101025-1.c -Os execution test
>> nvptx-none-run: gcc.c-torture/execute/20120105-1.c -O0 execution test
>> nvptx-none-run: gcc.c-torture/execute/20120111-1.c -O0 execution test
>>
>> New tests that PASS:
>>
>> nvptx-none-run: gcc.target/nvptx/sincos-1.c (test for excess errors)
>> nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times cos.approx.f32 1
>> nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times sin.approx.f32 1
>> nvptx-none-run: gcc.target/nvptx/sincos-2.c (test for excess errors)
>> nvptx-none-run: gcc.target/nvptx/sincos-2.c execution test
>>
>>
>>>> ISTR x87 handles this pass just fine and also can do sin and cos.
>>>>
>>>> Richard.
>>>>
>>>>> Cesar
>>>>
>>>>
>>
[-- Attachment #2: nvptx-sincos-20160517b.diff --]
[-- Type: text/x-patch, Size: 2237 bytes --]
2016-05-17 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* config/nvptx/nvptx.md (sincossf3): New pattern.
gcc/testsuite/
* gcc.target/nvptx/sincos-1.c: New test.
* gcc.target/nvptx/sincos-2.c: New test.
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..29642ec 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -794,6 +794,20 @@
""
"%.\\tsqrt%#%t0\\t%0, %1;")
+(define_expand "sincossf3"
+ [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+ (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+ UNSPEC_COS))
+ (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+ (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+ "flag_unsafe_math_optimizations"
+{
+ emit_insn (gen_sinsf2 (operands[1], operands[2]));
+ emit_insn (gen_cossf2 (operands[0], operands[2]));
+
+ DONE;
+})
+
(define_insn "sinsf2"
[(set (match_operand:SF 0 "nvptx_register_operand" "=R")
(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-1.c b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+ float s = sinf (x);
+ float c = cosf (x);
+
+ return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-2.c b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
new file mode 100644
index 0000000..b617a7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math" } */
+
+#include <assert.h>
+
+extern float sinf (float);
+extern float cosf (float);
+
+float val = 1.0;
+
+float
+test_sincos (float x, float other_cos)
+{
+ float s = sinf (x);
+ float c = cosf (x);
+
+ assert (c == other_cos);
+
+ return s + c;
+}
+
+int
+main ()
+{
+ float c = cosf (val);
+
+ test_sincos (val, c);
+
+ return 0;
+}
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-17 21:30 ` Cesar Philippidis
@ 2016-05-18 12:29 ` Nathan Sidwell
2016-05-19 3:43 ` Cesar Philippidis
0 siblings, 1 reply; 12+ messages in thread
From: Nathan Sidwell @ 2016-05-18 12:29 UTC (permalink / raw)
To: Cesar Philippidis, Andrew Pinski; +Cc: Richard Biener, gcc-patches
On 05/17/16 17:30, Cesar Philippidis wrote:
> On 05/17/2016 02:22 PM, Andrew Pinski wrote:
> Good eyes, thanks! I thought I had to create a new insn, but I got away
> with an expand. I attached the updated patch.
>
> Cesar
>>> gcc.sum
>>> Tests that now fail, but worked before:
>>>
>>> nvptx-none-run: gcc.c-torture/execute/20100316-1.c -Os execution test
>>> nvptx-none-run: gcc.c-torture/execute/20100708-1.c -O1 execution test
>>> nvptx-none-run: gcc.c-torture/execute/20100805-1.c -O0 execution test
>>> nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>>> nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -g execution test
>>>
Please determine why these now fail.
> +(define_expand "sincossf3"
> + [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
> + (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
> + UNSPEC_COS))
> + (set (match_operand:SF 1 "nvptx_register_operand" "=R")
> + (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
> + "flag_unsafe_math_optimizations"
> +{
> + emit_insn (gen_sinsf2 (operands[1], operands[2]));
> + emit_insn (gen_cossf2 (operands[0], operands[2]));
> +
> + DONE;
> +})
Why the emit_insn code? that seems to be replicating the RTL representation --
you're saying the same thing twice.
Doesn't operands[2] need (conditionally) copying to a new register -- what if it
aliases operands[1]?
> +++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
> @@ -0,0 +1,30 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math" } */
> +
What is this test trying to test? I'm puzzled by it. (btw, don't use assert,
either abort, exit(1) or return from main.)
nathan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-18 12:29 ` Nathan Sidwell
@ 2016-05-19 3:43 ` Cesar Philippidis
2016-05-19 11:29 ` Alexander Monakov
2016-05-19 23:12 ` Nathan Sidwell
0 siblings, 2 replies; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-19 3:43 UTC (permalink / raw)
To: Nathan Sidwell, Andrew Pinski; +Cc: Richard Biener, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2169 bytes --]
On 05/18/2016 05:29 AM, Nathan Sidwell wrote:
> On 05/17/16 17:30, Cesar Philippidis wrote:
>> On 05/17/2016 02:22 PM, Andrew Pinski wrote:
>>>> gcc.sum
>>>> Tests that now fail, but worked before:
>>>>
>>>> nvptx-none-run: gcc.c-torture/execute/20100316-1.c -Os execution
>>>> test
>>>> nvptx-none-run: gcc.c-torture/execute/20100708-1.c -O1 execution
>>>> test
>>>> nvptx-none-run: gcc.c-torture/execute/20100805-1.c -O0 execution
>>>> test
>>>> nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer
>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
>>>> nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -g execution test
>>>>
>
> Please determine why these now fail.
Those were failing intermittently, at least on my desktop. I'll look
into that it next.
>> +(define_expand "sincossf3"
>> + [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
>> + (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
>> + UNSPEC_COS))
>> + (set (match_operand:SF 1 "nvptx_register_operand" "=R")
>> + (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
>> + "flag_unsafe_math_optimizations"
>> +{
>> + emit_insn (gen_sinsf2 (operands[1], operands[2]));
>> + emit_insn (gen_cossf2 (operands[0], operands[2]));
>> +
>> + DONE;
>> +})
>
> Why the emit_insn code? that seems to be replicating the RTL
> representation -- you're saying the same thing twice.
>
> Doesn't operands[2] need (conditionally) copying to a new register --
> what if it aliases operands[1]?
This patch does that now.
>> +++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
>> @@ -0,0 +1,30 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +
>
> What is this test trying to test? I'm puzzled by it. (btw, don't use
> assert, either abort, exit(1) or return from main.)
My intent was to verify that I got the sin and cos arguments right,
i.e., make sure that this sincos expansion didn't mix up sin(x) with
cos(x). I guess I can create a test that uses vprintf and scans
dg-output for the proper results. But in this patch I just omitted that
test case altogether.
Is this patch ok for trunk?
Cesar
[-- Attachment #2: nvptx-sincos-20160518.diff --]
[-- Type: text/x-patch, Size: 1724 bytes --]
2016-05-18 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* config/nvptx/nvptx.md (sincossf3): New pattern.
gcc/testsuite/
* gcc.target/nvptx/sincos.c: New test.
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..69bbb22 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -794,6 +794,24 @@
""
"%.\\tsqrt%#%t0\\t%0, %1;")
+(define_expand "sincossf3"
+ [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+ (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+ UNSPEC_COS))
+ (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+ (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+ "flag_unsafe_math_optimizations"
+{
+ if (REGNO (operands[0]) == REGNO (operands[2]))
+ {
+ rtx tmp = gen_reg_rtx (GET_MODE (operands[2]));
+ emit_insn (gen_rtx_SET (tmp, operands[2]));
+ emit_insn (gen_sinsf2 (operands[1], tmp));
+ emit_insn (gen_cossf2 (operands[0], tmp));
+ DONE;
+ }
+})
+
(define_insn "sinsf2"
[(set (match_operand:SF 0 "nvptx_register_operand" "=R")
(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos.c b/gcc/testsuite/gcc.target/nvptx/sincos.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+ float s = sinf (x);
+ float c = cosf (x);
+
+ return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-19 3:43 ` Cesar Philippidis
@ 2016-05-19 11:29 ` Alexander Monakov
2016-05-19 18:42 ` Cesar Philippidis
2016-05-19 23:12 ` Nathan Sidwell
1 sibling, 1 reply; 12+ messages in thread
From: Alexander Monakov @ 2016-05-19 11:29 UTC (permalink / raw)
To: Cesar Philippidis
Cc: Nathan Sidwell, Andrew Pinski, Richard Biener, gcc-patches
On Wed, 18 May 2016, Cesar Philippidis wrote:
> >> +(define_expand "sincossf3"
> >> + [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
> >> + (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
> >> + UNSPEC_COS))
> >> + (set (match_operand:SF 1 "nvptx_register_operand" "=R")
> >> + (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
> >> + "flag_unsafe_math_optimizations"
> >> +{
> >> + emit_insn (gen_sinsf2 (operands[1], operands[2]));
> >> + emit_insn (gen_cossf2 (operands[0], operands[2]));
> >> +
> >> + DONE;
> >> +})
> >
> > Why the emit_insn code? that seems to be replicating the RTL
> > representation -- you're saying the same thing twice.
> >
> > Doesn't operands[2] need (conditionally) copying to a new register --
> > what if it aliases operands[1]?
>
> This patch does that now.
Note that the documentation suggests using 'make_safe_from' to concisely
express conflict resolution:
> diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
> index 33a4862..69bbb22 100644
> --- a/gcc/config/nvptx/nvptx.md
> +++ b/gcc/config/nvptx/nvptx.md
> @@ -794,6 +794,24 @@
> ""
> "%.\\tsqrt%#%t0\\t%0, %1;")
>
> +(define_expand "sincossf3"
> + [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
> + (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
> + UNSPEC_COS))
> + (set (match_operand:SF 1 "nvptx_register_operand" "=R")
> + (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
> + "flag_unsafe_math_optimizations"
... here instead of special-casing the conflict case in curly braces you can
just write:
"operands[2] = make_safe_from (operands[2], operands[0]);"
> +{
> + if (REGNO (operands[0]) == REGNO (operands[2]))
> + {
> + rtx tmp = gen_reg_rtx (GET_MODE (operands[2]));
> + emit_insn (gen_rtx_SET (tmp, operands[2]));
> + emit_insn (gen_sinsf2 (operands[1], tmp));
> + emit_insn (gen_cossf2 (operands[0], tmp));
> + DONE;
> + }
> +})
Alexander
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-19 11:29 ` Alexander Monakov
@ 2016-05-19 18:42 ` Cesar Philippidis
2016-05-19 23:30 ` Nathan Sidwell
0 siblings, 1 reply; 12+ messages in thread
From: Cesar Philippidis @ 2016-05-19 18:42 UTC (permalink / raw)
To: Alexander Monakov
Cc: Nathan Sidwell, Andrew Pinski, Richard Biener, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1316 bytes --]
On 05/19/2016 04:29 AM, Alexander Monakov wrote:
> On Wed, 18 May 2016, Cesar Philippidis wrote:
> Note that the documentation suggests using 'make_safe_from' to concisely
> express conflict resolution:
>
>> diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
>> index 33a4862..69bbb22 100644
>> --- a/gcc/config/nvptx/nvptx.md
>> +++ b/gcc/config/nvptx/nvptx.md
>> @@ -794,6 +794,24 @@
>> ""
>> "%.\\tsqrt%#%t0\\t%0, %1;")
>>
>> +(define_expand "sincossf3"
>> + [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
>> + (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
>> + UNSPEC_COS))
>> + (set (match_operand:SF 1 "nvptx_register_operand" "=R")
>> + (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
>> + "flag_unsafe_math_optimizations"
>
> ... here instead of special-casing the conflict case in curly braces you can
> just write:
>
> "operands[2] = make_safe_from (operands[2], operands[0]);"
>
>> +{
>> + if (REGNO (operands[0]) == REGNO (operands[2]))
>> + {
>> + rtx tmp = gen_reg_rtx (GET_MODE (operands[2]));
>> + emit_insn (gen_rtx_SET (tmp, operands[2]));
>> + emit_insn (gen_sinsf2 (operands[1], tmp));
>> + emit_insn (gen_cossf2 (operands[0], tmp));
>> + DONE;
>> + }
>> +})
Done. Is this ok for trunk?
Cesar
[-- Attachment #2: nvptx-sincos-20160519.diff --]
[-- Type: text/x-patch, Size: 1498 bytes --]
2016-05-19 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* config/nvptx/nvptx.md (sincossf3): New pattern.
gcc/testsuite/
* gcc.target/nvptx/sincos.c: New test.
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..1dd256d 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -794,6 +794,16 @@
""
"%.\\tsqrt%#%t0\\t%0, %1;")
+(define_expand "sincossf3"
+ [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+ (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+ UNSPEC_COS))
+ (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+ (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+ "flag_unsafe_math_optimizations"
+ "operands[2] = make_safe_from (operands[2], operands[0]);"
+)
+
(define_insn "sinsf2"
[(set (match_operand:SF 0 "nvptx_register_operand" "=R")
(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos.c b/gcc/testsuite/gcc.target/nvptx/sincos.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+ float s = sinf (x);
+ float c = cosf (x);
+
+ return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-19 3:43 ` Cesar Philippidis
2016-05-19 11:29 ` Alexander Monakov
@ 2016-05-19 23:12 ` Nathan Sidwell
1 sibling, 0 replies; 12+ messages in thread
From: Nathan Sidwell @ 2016-05-19 23:12 UTC (permalink / raw)
To: Cesar Philippidis, Andrew Pinski; +Cc: Richard Biener, gcc-patches
On 05/18/16 23:42, Cesar Philippidis wrote:
>>> +(define_expand "sincossf3"
>>> + [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
>>> + (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
>>> + UNSPEC_COS))
>>> + (set (match_operand:SF 1 "nvptx_register_operand" "=R")
>>> + (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
>>> + "flag_unsafe_math_optimizations"
>>> +{
>>> + emit_insn (gen_sinsf2 (operands[1], operands[2]));
>>> + emit_insn (gen_cossf2 (operands[0], operands[2]));
>>> +
>>> + DONE;
>>> +})
>>
>> Why the emit_insn code? that seems to be replicating the RTL
>> representation -- you're saying the same thing twice.
You've not answered this. I always find it confusing when insn expansions say
the same thing in the RTL represntation as the C fragment is generating.
> My intent was to verify that I got the sin and cos arguments right,
> i.e., make sure that this sincos expansion didn't mix up sin(x) with
> cos(x). I guess I can create a test that uses vprintf and scans
That's a good idea, but I think you need a much clearer testcase. It'd be
better to check both the sin and cos outputs (and, that's going to be
'interesting' making sure only one path gets to be seen by the sincos optimizer
-- but not difficult.
nathan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: inhibit the sincos optimization when the target has sin and cos instructions
2016-05-19 18:42 ` Cesar Philippidis
@ 2016-05-19 23:30 ` Nathan Sidwell
0 siblings, 0 replies; 12+ messages in thread
From: Nathan Sidwell @ 2016-05-19 23:30 UTC (permalink / raw)
To: Cesar Philippidis, Alexander Monakov
Cc: Andrew Pinski, Richard Biener, gcc-patches
On 05/19/16 14:42, Cesar Philippidis wrote:
> + "operands[2] = make_safe_from (operands[2], operands[0]);"
Please use { ... } rather than "" for readability. Ok with that change.
nathan
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-05-19 23:30 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-13 19:19 inhibit the sincos optimization when the target has sin and cos instructions Cesar Philippidis
2016-05-13 19:58 ` Richard Biener
2016-05-13 20:13 ` Andrew Pinski
2016-05-17 21:10 ` Cesar Philippidis
2016-05-17 21:23 ` Andrew Pinski
2016-05-17 21:30 ` Cesar Philippidis
2016-05-18 12:29 ` Nathan Sidwell
2016-05-19 3:43 ` Cesar Philippidis
2016-05-19 11:29 ` Alexander Monakov
2016-05-19 18:42 ` Cesar Philippidis
2016-05-19 23:30 ` Nathan Sidwell
2016-05-19 23:12 ` Nathan Sidwell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).