Message-ID: <04ae5eba-fe44-1ca2-a112-7aa2a383e188@codesourcery.com>
Date: Wed, 1 Mar 2023 12:13:31 +0000
Subject: Re: [PATCH] amdgcn: Enable SIMD vectorization of math functions
To: "Andre Vieira (lists)", Kwok Cheung Yeung, gcc-patches
From: Andrew Stubbs

On 01/03/2023 10:52, Andre Vieira (lists) wrote:
>
>
> On 01/03/2023 10:01, Andrew Stubbs wrote:
> > On 28/02/2023 23:01, Kwok Cheung Yeung wrote:
> >> Hello
> >>
> >> This patch implements the TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
> >> target hook for the AMD GCN architecture, such that when vectorized,
> >> calls to builtin standard math functions such as asinf, exp, pow etc.
> >> are converted to calls to the recently added vectorized math functions
> >> for GCN in Newlib. The -fno-math-errno flag is required in addition to
> >> the usual vectorization optimization flags for this to occur, and some
> >> of the math functions (the larger double-precision ones) require a
> >> large stack size to function properly.
> >>
> >> This patch requires the GCN vector math functions in Newlib to
> >> function - these were included in the recent 4.3.0.20230120 snapshot.
> >> As this was a minimum requirement starting from the patch 'amdgcn,
> >> libgomp: Manually allocated stacks', this should not be a problem.
> >>
> >> I have added new testcases in the testsuite that compare the output of
> >> the vectorized math functions against the scalar, passing if they are
> >> sufficiently close. With the testcase for standalone GCN (without
> >> libgomp) in gcc.target/gcn/, there is a problem since gcn-run
> >> currently cannot set the stack size correctly in DejaGnu testing, so I
> >> have made it a compile test for now - it is still useful to check that
> >> calls to the correct functions are being made. The runtime correctness
> >> is still covered by the libgomp test.
> >>
> >> Okay for trunk?
> >
> > The main part of the patch is OK, with the small changes below.
> >
> > Others have pointed out that "omp declare simd" exists, but you and I
> > have been all through that verbally, long ago, and as Tobias says the
> > offload compiler cannot rely on markup in the host compiler's header
> > files to solve this problem.
>
> For what it's worth, I am currently working on enabling "omp declare
> simd" for SVE and more importantly teaching GCC to use "omp declare
> variant"'s with simd construct's as simdclones during autovect. This
> gives a bit more control on what simdclones you advertise as available.
> I hope to have some RFC's on here soon. I obviously am not familiar with
> your constraints but just wanted to let you know.

We can use "omp declare target sim" (or whatever the exact form is) to
create SIMD clones of user functions, but this doesn't work for libm
functions (or any library) unless the header files for both the host and
the offload device have matching markup. Given that the x86_64 Linux
host is using Glibc and the offload device compiler is using Newlib,
this is not likely to be the case.

I suppose the variants thing could do something about this, but with
this patch we don't need it to, for now.

Andrew