From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 35346 invoked by alias); 17 May 2016 21:10:23 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 35324 invoked by uid 89); 17 May 2016 21:10:21 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS,URIBL_RED autolearn=ham version=3.3.2 spammy=richard.guenther@gmail.com, richardguenthergmailcom, sk:UNSPEC_, sk:unspec_
X-HELO: relay1.mentorg.com
Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Tue, 17 May 2016 21:10:11 +0000
Received: from svr-orw-fem-04.mgc.mentorg.com ([147.34.97.41])
 by relay1.mentorg.com with esmtp id 1b2mFl-0004Ae-HZ
 from Cesar_Philippidis@mentor.com ; Tue, 17 May 2016 14:10:09 -0700
Received: from [127.0.0.1] (147.34.91.1)
 by svr-orw-fem-04.mgc.mentorg.com (147.34.97.41) with Microsoft SMTP Server id 14.3.224.2; Tue, 17 May 2016 14:10:08 -0700
Subject: Re: inhibit the sincos optimization when the target has sin and cos instructions
To: Andrew Pinski, Richard Biener, Nathan Sidwell
References: <573628A1.1030501@codesourcery.com> <862033F1-A268-4236-B908-558C102199B5@gmail.com>
CC: "gcc-patches@gcc.gnu.org"
From: Cesar Philippidis
Message-ID: <573B88B0.2080508@codesourcery.com>
Date: Tue, 17 May 2016 21:10:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2
MIME-Version: 1.0
In-Reply-To:
Content-Type: multipart/mixed; boundary="------------060504040803040000080203"
X-SW-Source: 2016-05/txt/msg01283.txt.bz2

--------------060504040803040000080203
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Content-length: 5247

On 05/13/2016 01:13 PM,
Andrew Pinski wrote:
> On Fri, May 13, 2016 at 12:58 PM, Richard Biener
> wrote:
>> On May 13, 2016 9:18:57 PM GMT+02:00, Cesar Philippidis wrote:
>>> The cse_sincos pass tries to optimize sequences such as
>>>
>>>  sin (x);
>>>  cos (x);
>>>
>>> into a single call to sincos, or cexpi, when available. However, the
>>> nvptx target has sin and cos instructions, albeit with some loss of
>>> precision (so it's only enabled with -ffast-math). This patch teaches
>>> cse_sincos pass to ignore sin, cos and cexpi instructions when the
>>> target can expand those calls. This yields a 6x speedup in 314.omriq
>>> from spec accel when running on Nvidia accelerators.
>>>
>>> Is this OK for trunk?
>>
>> Isn't there an optab for sincos?
>
> This is exactly what I was going to suggest. This transformation
> should be done in the back-end back to sin/cos instructions.

I didn't realize that the 387 has sin, cos and sincos instructions, so
yeah, my original patch is bad. Nathan, is this patch ok for trunk and
gcc-6? It adds a new sincos pattern in the nvptx backend. I haven't
tested a standalone nvptx toolchain prior to this patch, so I'm not
sure if my test results look sane. I seem to be getting a different set
of failures when I test a clean trunk build multiple times. I attached
my results below for reference.

Cesar

g++.sum
Tests that now fail, but worked before:

nvptx-none-run: g++.dg/abi/param1.C -std=c++14 execution test

Tests that now work, but didn't before:

nvptx-none-run: g++.dg/opt/pr30590.C -std=gnu++98 execution test
nvptx-none-run: g++.dg/opt/pr36187.C -std=gnu++14 execution test

gfortran.sum
Tests that now fail, but worked before:

nvptx-none-run: gfortran.dg/alloc_comp_assign_10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gfortran.dg/allocate_with_source_5.f90 -O1 execution test
nvptx-none-run: gfortran.dg/func_assign_3.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/inline_sum_3.f90 -O1 execution test
nvptx-none-run: gfortran.dg/inline_sum_3.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/internal_pack_15.f90 -O2 execution test
nvptx-none-run: gfortran.dg/internal_pack_8.f90 -Os execution test
nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90 -O0 execution test
nvptx-none-run: gfortran.dg/intrinsic_ifunction_2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gfortran.dg/intrinsic_pack_5.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/intrinsic_product_1.f90 -O1 execution test
nvptx-none-run: gfortran.dg/intrinsic_verify_1.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/is_iostat_end_eor_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gfortran.dg/iso_c_binding_rename_1.f03 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test

Tests that now work, but didn't before:

nvptx-none-run: gfortran.dg/char_pointer_assign.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gfortran.dg/char_pointer_dummy.f90 -O1 execution test
nvptx-none-run: gfortran.dg/char_pointer_dummy.f90 -Os execution test
nvptx-none-run: gfortran.dg/char_result_13.f90 -O3 -g execution test
nvptx-none-run: gfortran.dg/char_result_2.f90 -O1 execution test
nvptx-none-run: gfortran.dg/char_type_len.f90 -Os execution test
nvptx-none-run: gfortran.dg/character_array_constructor_1.f90 -O0 execution test
nvptx-none-run: gfortran.dg/nested_allocatables_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test

gcc.sum
Tests that now fail, but worked before:

nvptx-none-run: gcc.c-torture/execute/20100316-1.c -Os execution test
nvptx-none-run: gcc.c-torture/execute/20100708-1.c -O1 execution test
nvptx-none-run: gcc.c-torture/execute/20100805-1.c -O0 execution test
nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
nvptx-none-run: gcc.dg/torture/pr52028.c -O3 -g execution test

Tests that now work, but didn't before:

nvptx-none-run: gcc.c-torture/execute/20091229-1.c -O3 -g execution test
nvptx-none-run: gcc.c-torture/execute/20101013-1.c -Os execution test
nvptx-none-run: gcc.c-torture/execute/20101025-1.c -Os execution test
nvptx-none-run: gcc.c-torture/execute/20120105-1.c -O0 execution test
nvptx-none-run: gcc.c-torture/execute/20120111-1.c -O0 execution test

New tests that PASS:

nvptx-none-run: gcc.target/nvptx/sincos-1.c (test for excess errors)
nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times cos.approx.f32 1
nvptx-none-run: gcc.target/nvptx/sincos-1.c scan-assembler-times sin.approx.f32 1
nvptx-none-run: gcc.target/nvptx/sincos-2.c (test for excess errors)
nvptx-none-run: gcc.target/nvptx/sincos-2.c execution test

>> ISTR x87 handles this pass just fine and also can do sin and cos.
>>
>> Richard.
>>
>>> Cesar
>>
>>

--------------060504040803040000080203
Content-Type: text/x-patch;
 name="nvptx-sincos-20160517.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="nvptx-sincos-20160517.diff"
Content-length: 2398

2016-05-17  Cesar Philippidis

	gcc/
	* config/nvptx/nvptx.md (unspec): Add UNSPEC_SINCOS.
	(sincossf3): New pattern.

	gcc/testsuite/
	* gcc.target/nvptx/sincos-1.c: New test.
	* gcc.target/nvptx/sincos-2.c: New test.


diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..03a2f67 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -26,6 +26,7 @@
    UNSPEC_EXP2
    UNSPEC_SIN
    UNSPEC_COS
+   UNSPEC_SINCOS
 
    UNSPEC_FPINT_FLOOR
    UNSPEC_FPINT_BTRUNC
@@ -794,6 +795,20 @@
   ""
   "%.\\tsqrt%#%t0\\t%0, %1;")
 
+(define_expand "sincossf3"
+  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+        (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+                   UNSPEC_COS))
+   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+        (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+  "flag_unsafe_math_optimizations"
+{
+  emit_insn (gen_sinsf2 (operands[1], operands[2]));
+  emit_insn (gen_cossf2 (operands[0], operands[2]));
+
+  DONE;
+})
+
 (define_insn "sinsf2"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
         (unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-1.c b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
new file mode 100644
index 0000000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos-2.c b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
new file mode 100644
index 0000000..b617a7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math" } */
+
+#include <assert.h>
+
+extern float sinf (float);
+extern float cosf (float);
+
+float val = 1.0;
+
+float
+test_sincos (float x, float other_cos)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  assert (c == other_cos);
+
+  return s + c;
+}
+
+int
+main ()
+{
+  float c = cosf (val);
+
+  test_sincos (val, c);
+
+  return 0;
+}

--------------060504040803040000080203--