Date: Fri, 10 Mar 2023 16:35:49 +0100
From: Jakub Jelinek
To: Richard Earnshaw, Kyrylo Tkachov, Jason Merrill, gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Subject: Re: AArch64 bfloat16 mangling

On Fri, Mar 10, 2023 at 11:50:39AM +0000, Richard Sandiford wrote:
> > Will test it momentarily (including the patch it depends on):

Note, testing still pending, I'm testing in a Fedora scratch build
and that is quite slow (lto bootstrap and the like).

> A naive question:
>
> > --- libgcc/config/aarch64/t-softfp.jj 2022-11-14 13:35:34.527155682 +0100
> > +++ libgcc/config/aarch64/t-softfp 2023-03-10 12:19:58.668882041 +0100
> > @@ -1,9 +1,10 @@
> > softfp_float_modes := tf
> > softfp_int_modes := si di ti
> > -softfp_extensions := sftf dftf hftf
> > -softfp_truncations := tfsf tfdf tfhf
> > +softfp_extensions := sftf dftf hftf bfsf
> > +softfp_truncations := tfsf tfdf tfhf tfbf dfbf sfbf hfbf
>
> Is bfsf used for conversions in which sf is the ultimate target,
> as opposed to operations that convert bf to sf and then do something
> with the sf?  And so the libfunc is needed to raise exceptions, which in
> more complex operations can be left to the following sf operation?
>
> Do we still optimise to a shift for -ffinite-math-only?

Reminds me I should have added testcase coverage for PR107703, will post
it momentarily.

But, consider say:

template <typename T, typename F>
[[gnu::noipa]] T cvt (F f)
{
  return T (F (f));
}

void
foo ()
{
  cvt <_Float32, __bf16> (0.0bf16);
  cvt <_Float64, __bf16> (0.0bf16);
  cvt <_Float128, __bf16> (0.0bf16);
  cvt (0.0bf16);
  cvt (0.0bf16);
  cvt (0.0bf16);
  cvt (0.0bf16);
  cvt <__int128, __bf16> (0.0bf16);
}

This emits on x86_64 -O2:
/usr/src/gcc/obj/gcc/cc1plus -quiet -O2 1111.C; grep call.*__ 1111.s
	call	__extendbfsf2
	call	__extendbfsf2
	call	__extendbfsf2
	call	__extendsftf2
	call	__fixsfti
where the first call, in cvt <_Float32, __bf16>, is really needed.
Admittedly the other two __extendbfsf2 calls could be replaced by shifts
but aren't right now (we expand BF -> DF as BF -> SF -> DF, and because
an sNaN would already be diagnosed by the SF -> DF conversion if BF -> SF
were done with a shift, I think that would be ok; similarly for BF -> TF).
All the others (BF -> ?I) are expanded as BF -> SF using a shift and then
SF -> ?I.

With -O2 -ffast-math:
/usr/src/gcc/obj/gcc/cc1plus -quiet -O2 -ffast-math 1111.C; grep call.*__ 1111.s
	call	__extendsftf2
	call	__fixsfti
so all the BF -> SF conversions are then done using shifts.

And aarch64 is exactly the same:
./cc1plus -quiet -nostdinc -O2 1111.C; grep bl.*__[ef] 1111.s
	bl	__extendbfsf2
	bl	__extendbfsf2
	bl	__extendbfsf2
	bl	__extendsftf2
	bl	__fixsfti
./cc1plus -quiet -nostdinc -O2 -ffast-math 1111.C; grep bl.*__[ef] 1111.s
	bl	__extendsftf2
	bl	__fixsfti

> Assuming so, the patch LGTM.  I'm not familiar enough with softfloat
> to do a meaningful review of those parts, and I'm taking the versioning
> changes on faith. :)

The new soft-fp files (in both patches) are fairly mechanical:
for i in float{,un}{d,t}isf.c; do \
  sed 's/IEEE single/bfloat16/;s/single/brain/;s/SFtype/BFtype/;s/_S /_B /;s/sf /bf /' \
    $i > `echo $i | sed 's/sf.c/bf.c/'`
done
(well, I've created them by hand, so the Copyright lines differ, but
otherwise they are identical to what the above script would create).
So there are no smarts in those; the soft-fp library can already handle
those formats.

	Jakub
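
As a footnote on the "shift" discussed above: bfloat16 shares the sign and
exponent layout of IEEE binary32 and keeps only the top 7 mantissa bits, so
widening can be done by shifting the bit pattern left by 16.  The C++ sketch
below illustrates that, assuming float is binary32.  The helper names are
made up for illustration and are not GCC or libgcc functions.  It also hints
at why a shift alone is not always enough: a signaling NaN passes through
with its bit pattern unchanged and no invalid exception is raised, which is
the case the __extendbfsf2 libcall covers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen a raw bfloat16 bit pattern to binary32 by
// placing it in the upper 16 bits.  Assumes float is IEEE binary32.
float bf16_bits_to_float (std::uint16_t bits)
{
  static_assert (sizeof (float) == sizeof (std::uint32_t),
		 "binary32 float expected");
  std::uint32_t wide = static_cast<std::uint32_t> (bits) << 16;
  float result;
  std::memcpy (&result, &wide, sizeof result);	// bit copy, no FP operation
  return result;
}

// Hypothetical helper: true if the bfloat16 pattern is a signaling NaN,
// i.e. exponent all ones, nonzero mantissa, and the quiet bit clear.
bool bf16_bits_is_snan (std::uint16_t bits)
{
  bool exp_all_ones = (bits & 0x7f80) == 0x7f80;
  bool mantissa_nonzero = (bits & 0x007f) != 0;
  bool quiet_bit = (bits & 0x0040) != 0;
  return exp_all_ones && mantissa_nonzero && !quiet_bit;
}

// Small usage example: 0x3f80 is 1.0 and 0xc000 is -2.0 in bfloat16.
void check_shift_examples ()
{
  assert (bf16_bits_to_float (0x3f80) == 1.0f);
  assert (bf16_bits_to_float (0xc000) == -2.0f);
  assert (bf16_bits_is_snan (0x7f81));	// this input needs the libcall
}
```

A compiler that may ignore NaNs (as with -ffast-math) never has to care
about the bf16_bits_is_snan case, which matches the observation above that
all BF -> SF conversions become shifts there.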