From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=L3Je=7C=arm.com=richard.sandiford@sourceware.org>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
	by sourceware.org (Postfix) with ESMTP id 7461A3858D1E
	for <gcc-patches@gcc.gnu.org>; Fri, 10 Mar 2023 11:50:41 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7461A3858D1E
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 983C5C14;
	Fri, 10 Mar 2023 03:51:24 -0800 (PST)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72])
	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 499253F5A1;
	Fri, 10 Mar 2023 03:50:40 -0800 (PST)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Jakub Jelinek <jakub@redhat.com>
Mail-Followup-To: Jakub Jelinek <jakub@redhat.com>,Richard Earnshaw <richard.earnshaw@arm.com>,  Kyrylo Tkachov <kyrylo.tkachov@arm.com>,  Jason Merrill <jason@redhat.com>,  gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Cc: Richard Earnshaw <richard.earnshaw@arm.com>,  Kyrylo Tkachov <kyrylo.tkachov@arm.com>,  Jason Merrill <jason@redhat.com>,  gcc-patches@gcc.gnu.org
Subject: Re: AArch64 bfloat16 mangling
References: <Y9eS7Yt5uIVIyCzZ@tucnak> <mptwn53hb9w.fsf@arm.com>
	<Y9o+dJnVhavp+Edg@tucnak> <mpt8rg526fw.fsf@arm.com>
	<ZArsNWoxC315JyOQ@tucnak> <mpth6utyp2h.fsf@arm.com>
	<ZAsU2hwy2IonHJ3Q@tucnak>
Date: Fri, 10 Mar 2023 11:50:39 +0000
In-Reply-To: <ZAsU2hwy2IonHJ3Q@tucnak> (Jakub Jelinek's message of "Fri, 10
	Mar 2023 12:30:34 +0100")
Message-ID: <mpto7p0ygds.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Status: No, score=-27.4 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,KAM_SHORT,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

Jakub Jelinek <jakub@redhat.com> writes:
> On Fri, Mar 10, 2023 at 08:43:02AM +0000, Richard Sandiford wrote:
>> > So, either __bf16 should be also extended floating-point type
>> > like decltype (0.0bf16) and std::bfloat16_t and in that case
>> > it is fine if it mangles u6__bf16, or __bf16 will be a distinct
>> > type from the latter two,
>> 
>> Yeah, the former is what I meant.  The intention is that __bf16 and
>> std::bfloat16_t are the same type, not distinct types.
>
> Ok, in that case here is a totally untested patch on top of
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606398.html
> which is also needed (for aarch64 of course the i386 parts of the
> patch which have been acked already don't matter but the 2 libgcc
> new files are needed and the optabs change is too).

OK for the rest of that.

> The reason why __floatdibf and __floatundibf are needed on aarch64
> and not on x86 is that the latter has optabs for DI -> XF conversions
> and so for DI -> BF uses DI -> XF -> BF where the first conversion
> doesn't round/truncate anything.  While on aarch64 DI -> TF conversion
> where TF is the narrowest mode which can hold all DI values exactly
> is done using a libcall and so GCC emits direct DI -> BF conversions.
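
For anyone following along, the point about the first conversion not
rounding can be illustrated with a small sketch.  It compares a direct
64-bit integer -> bfloat16 conversion (one rounding) with a route
through float, which is too narrow to hold every DI value and so rounds
twice.  This is only an illustration, not the soft-fp code in the patch:
the helper names are hypothetical, the routines work on raw bfloat16 bit
patterns, they assume round-to-nearest-even and a binary32 float, and
they rely on GCC's __builtin_clzll.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round a finite float to bfloat16 bits using
   round-to-nearest-even.  NaN inputs are not handled.  */
static uint16_t
float_to_bf16_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  /* Add just under half an ulp, plus one if the kept LSB is odd.  */
  u += 0x7fff + ((u >> 16) & 1);
  return (uint16_t) (u >> 16);
}

/* Hypothetical helper: convert X to bfloat16 bits with one rounding.  */
static uint16_t
u64_to_bf16_bits_direct (uint64_t x)
{
  if (x == 0)
    return 0;
  int msb = 63 - __builtin_clzll (x);
  uint64_t kept;
  if (msb <= 7)
    /* At most 8 significant bits: the value is exact.  */
    kept = x << (7 - msb);
  else
    {
      int shift = msb - 7;
      uint64_t half = UINT64_C (1) << (shift - 1);
      uint64_t rest = x & ((half << 1) - 1);
      kept = x >> shift;
      if (rest > half || (rest == half && (kept & 1)))
	kept++;		/* May reach 256; the carry bumps the exponent.  */
    }
  return (uint16_t) (((uint32_t) (127 + msb) << 7)
		     + (uint32_t) (kept - 128));
}

/* Hypothetical helper: the same conversion via float, rounding twice.  */
static uint16_t
u64_to_bf16_bits_via_float (uint64_t x)
{
  return float_to_bf16_bits ((float) x);
}
```

With x = 2^32 + 2^24 + 1, the direct routine rounds up to 0x4f81, since
x is just above the halfway point between two bfloat16 values.  The
float route first rounds x down to exactly that halfway point and then
resolves the tie to even, giving 0x4f80.  An intermediate mode that
holds every DI value exactly (XF on x86) cannot lose the trailing 1, so
the two-step route is safe there.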
>
> Will test it momentarily (including the patch it depends on):
>
> 2023-03-10  Jakub Jelinek  <jakub@redhat.com>
>
> gcc/
> 	* config/aarch64/aarch64.h (aarch64_bf16_type_node): Remove.
> 	(aarch64_bf16_ptr_type_node): Adjust comment.
> 	* config/aarch64/aarch64.cc (aarch64_gimplify_va_arg_expr): Use
> 	bfloat16_type_node rather than aarch64_bf16_type_node.
> 	(aarch64_libgcc_floating_mode_supported_p,
> 	aarch64_scalar_mode_supported_p): Also support BFmode.
> 	(aarch64_invalid_conversion, aarch64_invalid_unary_op): Remove.
> 	(aarch64_invalid_binary_op): Remove BFmode related rejections.
> 	(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP): Don't redefine.
> 	* config/aarch64/aarch64-builtins.cc (aarch64_bf16_type_node): Remove.
> 	(aarch64_int_or_fp_type): Use bfloat16_type_node rather than
> 	aarch64_bf16_type_node.
> 	(aarch64_init_simd_builtin_types): Likewise.
> 	(aarch64_init_bf16_types): Likewise.  Don't create bfloat16_type_node,
> 	which is created in tree.cc already.
> 	* config/aarch64/aarch64-sve-builtins.def (svbfloat16_t): Likewise.
> gcc/testsuite/
> 	* gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_opt_n_1.c:
> 	Don't expect one __bf16 related error.
> libgcc/
> 	* config/aarch64/t-softfp (softfp_extensions): Add bfsf.
> 	(softfp_truncations): Add tfbf dfbf sfbf hfbf.
> 	(softfp_extras): Add floatdibf floatundibf floattibf floatuntibf.
> 	* config/aarch64/libgcc-softfp.ver (GCC_13.0.0): Export
> 	__extendbfsf2 and __trunc{s,d,t,h}fbf2.
> 	* config/aarch64/sfp-machine.h (_FP_NANFRAC_B, _FP_NANSIGN_B): Define.
> 	* soft-fp/floatundibf.c: New file.
> 	* soft-fp/floatdibf.c: New file.
> libstdc++-v3/
> 	* config/abi/pre/gnu.ver (CXXABI_1.3.14): Also export __bf16 tinfos
> 	if it isn't mangled as DF16b but u6__bf16.

Thanks, looks great.  Nice to see all the - lines. :)

A naive question:

> --- libgcc/config/aarch64/t-softfp.jj	2022-11-14 13:35:34.527155682 +0100
> +++ libgcc/config/aarch64/t-softfp	2023-03-10 12:19:58.668882041 +0100
> @@ -1,9 +1,10 @@
>  softfp_float_modes := tf
>  softfp_int_modes := si di ti
> -softfp_extensions := sftf dftf hftf
> -softfp_truncations := tfsf tfdf tfhf
> +softfp_extensions := sftf dftf hftf bfsf
> +softfp_truncations := tfsf tfdf tfhf tfbf dfbf sfbf hfbf

Is bfsf used for conversions in which sf is the ultimate target,
as opposed to operations that convert bf to sf and then do something
with the sf?  And so the libfunc is needed to raise exceptions, which in
more complex operations can be left to the following sf operation?

Do we still optimise to a shift for -ffinite-math-only?

Assuming so, the patch LGTM.  I'm not familiar enough with softfloat
to do a meaningful review of those parts, and I'm taking the versioning
changes on faith. :)
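
To make the shift question above concrete, here is a minimal sketch of
what I mean.  bfloat16 has the same sign bit and 8-bit exponent as
binary32 and keeps the top 7 fraction bits, so for non-signalling inputs
the bf -> sf extension is just a 16-bit left shift of the bit pattern.
The only inputs I'd expect to need more work are signalling NaNs, which
IEEE 754 says should raise "invalid" and be quietened.  That is my
reading of why a libfunc might be wanted, not something confirmed
above.  The helper names are hypothetical and operate on raw bits; this
is not __extendbfsf2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (float) == sizeof (uint32_t),
	       "this sketch assumes a 32-bit float");

/* Hypothetical helper: widen raw bfloat16 bits to float by placing
   them in the top half of a binary32 word.  No exceptions are raised
   and signalling NaNs are passed through unquietened.  */
static float
bf16_bits_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;
  float f;
  memcpy (&f, &wide, sizeof f);
  return f;
}

/* Hypothetical helper: return nonzero if BITS encodes a signalling
   NaN, i.e. the case in which the plain shift above is not enough.  */
static int
bf16_bits_is_snan (uint16_t bits)
{
  return ((bits & 0x7f80) == 0x7f80	/* Exponent all ones.  */
	  && (bits & 0x007f) != 0	/* Nonzero fraction: a NaN.  */
	  && (bits & 0x0040) == 0);	/* Quiet bit clear.  */
}
```

For example, 0x3f80 widens to 1.0f and 0xc000 to -2.0f, while 0x7f81 is
a signalling NaN that the shift would pass through unchanged.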

Thanks,
Richard

>  softfp_exclude_libgcc2 := n
> -softfp_extras := fixhfti fixunshfti floattihf floatuntihf
> +softfp_extras := fixhfti fixunshfti floattihf floatuntihf \
> +		 floatdibf floatundibf floattibf floatuntibf
>  
>  TARGET_LIBGCC2_CFLAGS += -Wno-missing-prototypes
>  
> --- libgcc/config/aarch64/libgcc-softfp.ver.jj	2023-01-16 11:52:16.633725959 +0100
> +++ libgcc/config/aarch64/libgcc-softfp.ver	2023-03-10 12:11:44.144082714 +0100
> @@ -26,3 +26,16 @@ GCC_11.0 {
>    __mulhc3
>    __trunctfhf2
>  }
> +
> +%inherit GCC_13.0.0 GCC_11.0.0
> +GCC_13.0.0 {
> +  __extendbfsf2
> +  __floatdibf
> +  __floattibf
> +  __floatundibf
> +  __floatuntibf
> +  __truncdfbf2
> +  __truncsfbf2
> +  __trunctfbf2
> +  __trunchfbf2
> +}
> --- libgcc/config/aarch64/sfp-machine.h.jj	2023-01-16 11:52:16.633725959 +0100
> +++ libgcc/config/aarch64/sfp-machine.h	2023-03-10 11:49:35.985435685 +0100
> @@ -43,10 +43,12 @@ typedef int __gcc_CMPtype __attribute__
>  #define _FP_DIV_MEAT_Q(R,X,Y)	_FP_DIV_MEAT_2_udiv(Q,R,X,Y)
>  
>  #define _FP_NANFRAC_H		((_FP_QNANBIT_H << 1) - 1)
> +#define _FP_NANFRAC_B		((_FP_QNANBIT_B << 1) - 1)
>  #define _FP_NANFRAC_S		((_FP_QNANBIT_S << 1) - 1)
>  #define _FP_NANFRAC_D		((_FP_QNANBIT_D << 1) - 1)
>  #define _FP_NANFRAC_Q		((_FP_QNANBIT_Q << 1) - 1), -1
>  #define _FP_NANSIGN_H		0
> +#define _FP_NANSIGN_B		0
>  #define _FP_NANSIGN_S		0
>  #define _FP_NANSIGN_D		0
>  #define _FP_NANSIGN_Q		0
> --- libgcc/soft-fp/floatundibf.c.jj	2023-03-10 12:10:40.143014939 +0100
> +++ libgcc/soft-fp/floatundibf.c	2023-03-10 12:11:07.387618096 +0100
> @@ -0,0 +1,45 @@
> +/* Software floating-point emulation.
> +   Convert a 64bit unsigned integer to bfloat16
> +   Copyright (C) 2007-2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   In addition to the permissions in the GNU Lesser General Public
> +   License, the Free Software Foundation gives you unlimited
> +   permission to link the compiled version of this file into
> +   combinations with other programs, and to distribute those
> +   combinations without any restriction coming from the use of this
> +   file.  (The Lesser General Public License restrictions do apply in
> +   other respects; for example, they cover modification of the file,
> +   and distribution when not linked into a combine executable.)
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include "soft-fp.h"
> +#include "brain.h"
> +
> +BFtype
> +__floatundibf (UDItype i)
> +{
> +  FP_DECL_EX;
> +  FP_DECL_B (A);
> +  BFtype a;
> +
> +  FP_INIT_ROUNDMODE;
> +  FP_FROM_INT_B (A, i, DI_BITS, UDItype);
> +  FP_PACK_RAW_B (a, A);
> +  FP_HANDLE_EXCEPTIONS;
> +
> +  return a;
> +}
> --- libgcc/soft-fp/floatdibf.c.jj	2023-03-10 12:08:56.752520872 +0100
> +++ libgcc/soft-fp/floatdibf.c	2023-03-10 12:09:56.934644288 +0100
> @@ -0,0 +1,45 @@
> +/* Software floating-point emulation.
> +   Convert a 64bit signed integer to bfloat16
> +   Copyright (C) 2007-2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   In addition to the permissions in the GNU Lesser General Public
> +   License, the Free Software Foundation gives you unlimited
> +   permission to link the compiled version of this file into
> +   combinations with other programs, and to distribute those
> +   combinations without any restriction coming from the use of this
> +   file.  (The Lesser General Public License restrictions do apply in
> +   other respects; for example, they cover modification of the file,
> +   and distribution when not linked into a combine executable.)
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include "soft-fp.h"
> +#include "brain.h"
> +
> +BFtype
> +__floatdibf (DItype i)
> +{
> +  FP_DECL_EX;
> +  FP_DECL_B (A);
> +  BFtype a;
> +
> +  FP_INIT_ROUNDMODE;
> +  FP_FROM_INT_B (A, i, DI_BITS, UDItype);
> +  FP_PACK_RAW_B (a, A);
> +  FP_HANDLE_EXCEPTIONS;
> +
> +  return a;
> +}
> --- libstdc++-v3/config/abi/pre/gnu.ver.jj	2023-03-07 18:57:13.135213321 +0100
> +++ libstdc++-v3/config/abi/pre/gnu.ver	2023-03-10 11:52:27.870929478 +0100
> @@ -2828,6 +2828,9 @@ CXXABI_1.3.14 {
>      _ZTIDF[0-9]*[_bx];
>      _ZTIPDF[0-9]*[_bx];
>      _ZTIPKDF[0-9]*[_bx];
> +    _ZTIu6__bf16;
> +    _ZTIPu6__bf16;
> +    _ZTIPKu6__bf16;
>  
>  } CXXABI_1.3.13;
>  
>
>
> 	Jakub