Message-ID: <559C0995.6010105@arm.com>
Date: Tue, 07 Jul 2015 17:17:00 -0000
From: Alan Lawrence
To: Kyrill Tkachov
CC: "gcc-patches@gcc.gnu.org", Ramana Radhakrishnan, Tejas Belagod, Richard Earnshaw
Subject: Re: [PATCH 3/16][ARM] Add float16x4_t intrinsics
References: <559BC75A.1080606@arm.com> <559BCF8A.4090704@arm.com> <559BFCBD.4080806@arm.com> <559BFF83.20306@arm.com> <559C03D0.8010104@arm.com>
In-Reply-To: <559C03D0.8010104@arm.com>

Kyrill Tkachov wrote:
> On 07/07/15 17:34, Alan Lawrence wrote:
>> Kyrill Tkachov wrote:
>>> On 07/07/15 14:09, Kyrill Tkachov wrote:
>>>> Hi Alan,
>>>>
>>>> On 07/07/15 13:34, Alan Lawrence wrote:
>>>>> As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01335.html
>>>> For some context, the reference for these is at:
>>>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf
>>>>
>>>> This patch is ok once you and Charles decide on how to proceed with the two prerequisites.
>>> On second thought, the ACLE document at http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf
>>>
>>> says in 12.2.1:
>>> "float16 types are only available when the __fp16 type is defined, i.e. when supported by the hardware"
>> However, we support __fp16 whenever the user specifies -mfp16-format=ieee or
>> -mfp16-format=alternative, regardless of whether we have hardware support or not.
>>
>> (Without hardware support, gcc generates calls to __gnu_f2h_ieee or
>> __gnu_f2h_alternative instead of vcvtb.f16.f32, and __gnu_h2f_ieee or
>> __gnu_h2f_alternative instead of vcvtb.f32.f16. However, there is no way to
>> support __fp16 just using those hardware instructions without caring about which
>> format is in use.)
>
> Hmmm... In my opinion intrinsics should aim to map to instructions rather than go away and
> call library functions, but this is the existing functionality
> that current users might depend on :(

Sorry - to clarify: currently we generate __gnu_f2h_ieee / __gnu_h2f_ieee, to
convert between single __fp16 and 'float' values, when there is no HW. General
operations on scalar __fp16 values are performed by converting to float,
performing operations on float, and converting back. The __fp16 type is
available and "usable" without HW support, but only when -mfp16-format is specified.

(The existing) intrinsics operating on float16x[48] vectors (converting to/from
float32x4) are *not* available without hardware support; these intrinsics *are*
available without specifying -mfp16-format.
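
To make the "convert to float, operate, convert back" scheme above concrete, here is a minimal portable sketch for the IEEE format only. It is not libgcc's __gnu_f2h_ieee / __gnu_h2f_ieee code; the helper names half_to_float, float_to_half and half_add are hypothetical, and half values are carried as raw uint16_t bit patterns so the sketch builds without __fp16 support. It assumes float is IEEE binary32 and rounds to nearest even when narrowing.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen IEEE binary16 bits to a float. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;
    float f;

    if (exp == 0x1Fu) {                 /* Inf or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {              /* normal: rebias 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {              /* signed zero */
        bits = sign;
    } else {                            /* subnormal: normalise first */
        int e = -1;
        do { man <<= 1; e++; } while (!(man & 0x400u));
        bits = sign | ((uint32_t)(112 - e) << 23) | ((man & 0x3FFu) << 13);
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: narrow a float to IEEE binary16 bits,
   rounding to nearest, ties to even. */
static uint16_t float_to_half(float f)
{
    uint32_t x, a;
    uint16_t sign;

    memcpy(&x, &f, sizeof x);
    sign = (uint16_t)((x >> 16) & 0x8000u);
    a = x & 0x7FFFFFFFu;

    if (a > 0x7F800000u)                /* NaN: return a quiet NaN */
        return (uint16_t)(sign | 0x7E00u);
    if (a >= 0x47800000u)               /* >= 65536 or Inf: overflow to Inf */
        return (uint16_t)(sign | 0x7C00u);
    if (a >= 0x38800000u) {             /* result is a normal half */
        uint32_t r = a - 0x38000000u;   /* rebias 127 -> 15 */
        r += 0x0FFFu + ((r >> 13) & 1u);/* round; a carry may reach Inf */
        return (uint16_t)(sign | (r >> 13));
    }
    if (a < 0x33000000u)                /* below 2^-25: rounds to zero */
        return sign;
    {                                   /* result is a subnormal half */
        uint32_t shift   = 126u - (a >> 23);
        uint32_t mant    = (a & 0x007FFFFFu) | 0x00800000u;
        uint32_t half    = mant >> shift;
        uint32_t rem     = mant & ((1u << shift) - 1u);
        uint32_t halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (half & 1u)))
            half++;
        return (uint16_t)(sign | half);
    }
}

/* Scalar half addition: widen, add in float, narrow. */
static uint16_t half_add(uint16_t a, uint16_t b)
{
    return float_to_half(half_to_float(a) + half_to_float(b));
}
```

For example, half_add(0x3C00, 0x4000) adds 1.0 and 2.0 and returns 0x4200, the bit pattern for 3.0. The alternative format would need different handling of the top exponent value, which is why the conversion helpers cannot be format-agnostic.
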

ACLE (4.1.2) allows toolchains to provide __fp16 when not implemented in HW,
even if this is not required.

> CC'ing the ARM maintainers and Tejas for an ACLE perspective.
> I think that we'd want to gate the definition of __fp16 on hardware availability as well
> (the -mfpu option) rather than just arm_fp16_format but I'm not sure of the impact this will have
> on existing users.

Sure....but do we require -mfpu *and* -mfp16-format? s/and/or/ ? Do we require
-mfp16-format for float16x[48] intrinsics, or allow format-agnostic code (as HW
support allows us to!)?

I don't have very strong opinions as to which way we should go, I merely tried
to be consistent with the existing codebase, and to support as much code as
possible, although I agree I ignored cases where defining functions unexpectedly
might cause problems.

Cheers,
Alan
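
The "and" versus "or" gating question above can be sketched from the user's side with feature-test macros. This is only an illustration, not what the patch does: the macro names __ARM_FP16_FORMAT_IEEE, __ARM_FP16_FORMAT_ALTERNATIVE and __ARM_FP (with bit 1 taken to mean half-precision hardware support) are as I understand them from ACLE and are not confirmed anywhere in this thread, and the enum and function names are hypothetical.

```c
/* Bit flags for the summary below (hypothetical names). */
enum {
    FP16_FMT_IEEE = 1,  /* an IEEE half format was selected        */
    FP16_FMT_ALT  = 2,  /* the alternative half format was selected */
    FP16_HW       = 4   /* half-precision support in hardware       */
};

/* Collect what the compiler advertises via ACLE-style macros.
   On targets that define none of them this simply returns 0. */
static int fp16_config(void)
{
    int flags = 0;
#if defined(__ARM_FP16_FORMAT_IEEE)
    flags |= FP16_FMT_IEEE;
#endif
#if defined(__ARM_FP16_FORMAT_ALTERNATIVE)
    flags |= FP16_FMT_ALT;
#endif
#if defined(__ARM_FP) && (__ARM_FP & 0x2)
    flags |= FP16_HW;   /* assumption: bit 1 of __ARM_FP means half precision */
#endif
    return flags;
}

/* "and" policy: a format must be chosen AND hardware must be present. */
static int fp16_gate_and(int flags)
{
    return (flags & (FP16_FMT_IEEE | FP16_FMT_ALT)) && (flags & FP16_HW);
}

/* "or" policy: either a chosen format OR hardware support is enough. */
static int fp16_gate_or(int flags)
{
    return (flags & (FP16_FMT_IEEE | FP16_FMT_ALT)) || (flags & FP16_HW);
}
```

Calling fp16_gate_and(fp16_config()) and fp16_gate_or(fp16_config()) shows which configurations each policy would accept. Under the "and" policy a build with -mfp16-format but no hardware support loses the type, which is the existing-user impact raised above.
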