From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=QnXE=66=loongson.cn=chenglulu@sourceware.org>
Received: from loongson.cn (mail.loongson.cn [114.242.206.163])
	by sourceware.org (Postfix) with ESMTP id 0BCCB3858D35
	for <gcc-patches@gcc.gnu.org>; Mon,  6 Mar 2023 07:21:41 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0BCCB3858D35
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=loongson.cn
Received: from loongson.cn (unknown [10.20.4.52])
	by gateway (Coremail) with SMTP id _____8DxEzSClAVkdq8IAA--.10845S3;
	Mon, 06 Mar 2023 15:21:39 +0800 (CST)
Received: from [10.20.4.52] (unknown [10.20.4.52])
	by localhost.localdomain (Coremail) with SMTP id AQAAf8Bxib2ClAVk_SxMAA--.1629S2;
	Mon, 06 Mar 2023 15:21:38 +0800 (CST)
Subject: Re: [PATCH v2] LoongArch: Stop -mfpu from silently breaking ABI
 [PR109000]
To: Xi Ruoyao <xry111@xry111.site>, gcc-patches@gcc.gnu.org
Cc: WANG Xuerui <i@xen0n.name>, Chenghua Xu <xuchenghua@loongson.cn>,
 Yujie Yang <yangyujie@loongson.cn>
References: <20230303081658.6383-1-xry111@xry111.site>
From: Lulu Cheng <chenglulu@loongson.cn>
Message-ID: <88ce32ee-75e1-6a93-df99-3b75b391cd0f@loongson.cn>
Date: Mon, 6 Mar 2023 15:21:38 +0800
User-Agent: Mozilla/5.0 (X11; Linux mips64; rv:68.0) Gecko/20100101
 Thunderbird/68.7.0
MIME-Version: 1.0
In-Reply-To: <20230303081658.6383-1-xry111@xry111.site>
Content-Type: text/plain; charset=gbk; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-US
X-CM-TRANSID:AQAAf8Bxib2ClAVk_SxMAA--.1629S2
X-CM-SenderInfo: xfkh0wpoxo3qxorr0wxvrqhubq/
X-Coremail-Antispam: 1Uk129KBjvJXoWxuw4DZFWDKw4kAw1xKFykKrg_yoWxtFykpa
	y7Arn0kFWxGFZ7WF1kWa45Xr4avr4xGrW3u3ZIq3yFyr1ftryIq3W8KF17CFWUJ3WUJryS
	9FWruayj93W0vwUanT9S1TB71UUUUUDqnTZGkaVYY2UrUUUUj1kv1TuYvTs0mT0YCTnIWj
	qI5I8CrVACY4xI64kE6c02F40Ex7xfYxn0WfASr-VFAUDa7-sFnT9fnUUIcSsGvfJTRUUU
	bI8YFVCjjxCrM7AC8VAFwI0_Jr0_Gr1l1xkIjI8I6I8E6xAIw20EY4v20xvaj40_Wr0E3s
	1l1IIY67AEw4v_Jr0_Jr4l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28EF7xv
	wVC0I7IYx2IY67AKxVW5JVW7JwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwA2z4
	x0Y4vEx4A2jsIE14v26r4UJVWxJr1l84ACjcxK6I8E87Iv6xkF7I0E14v26r4UJVWxJr1l
	e2I262IYc4CY6c8Ij28IcVAaY2xG8wAqjxCEc2xF0cIa020Ex4CE44I27wAqx4xG64xvF2
	IEw4CE5I8CrVC2j2WlYx0E2Ix0cI8IcVAFwI0_Jrv_JF1lYx0Ex4A2jsIE14v26r1j6r4U
	McvjeVCFs4IE7xkEbVWUJVW8JwACjcxG0xvEwIxGrwCYjI0SjxkI62AI1cAE67vIY487Mx
	AIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_
	Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUAVWUtwCIc40Y0x0EwI
	xGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWUJVW8
	JwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcV
	C2z280aVCY1x0267AKxVWUJVW8JbIYCTnIWIevJa73UjIFyTuYvjxUwmhFDUUUU
X-Spam-Status: No, score=-10.0 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_STATUS,KAM_SHORT,MIME_CHARSET_FARAWAY,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>


在 2023/3/3 下午4:16, Xi Ruoyao 写道:
> In the toolchain convention, we describe -mfpu= as:
>
> "Selects the allowed set of basic floating-point instructions and
> registers. This option should not change the FP calling convention
> unless it's necessary."
>
> Though not explicitly stated, the rationale of this rule is to allow
> combinations like "-mabi=lp64s -mfpu=64".  This will be useful for
> running applications with LP64S/F ABI on a double-float-capable
> LoongArch hardware and using a math library with LP64S/F ABI but native
> double float HW instructions, for a better performance.
>
> And now a case in Linux kernel has again proven the usefulness of this
> kind of combination.  The AMDGPU DCN kernel driver needs to perform some
> floating-point operation, but the entire kernel uses LP64S ABI.  So the
> translation units of the AMDGPU DCN driver need to be compiled with
> -mfpu=64 (the kernel lacks soft-FP routines in libgcc), but -mabi=lp64s
> (or you can't link it with the other part of the kernel).
>
> Unfortunately, currently GCC uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to
> determine the floating calling convention.  This causes "-mfpu=64"
> silently allow using $fa* to pass parameters and return values EVEN IF
> -mabi=lp64s is used.  To make things worse, the generated object file
> has SOFT-FLOAT set in the eflags field so the linker will happily link
> it with other LP64S ABI object files, but obviously this will lead to
> bad results at runtime.  And for now all loongarch64 CPU models (-march
> settings) implies -mfpu=64 on by default, so the issue makes a single
> "-mabi=lp64s" option basically broken (fortunately most projects for eg
> the Linux kernel have used -msoft-float which implies both -mabi=lp64s
> and -mfpu=none as we've recommended in the toolchain convention doc).
>
> The fix is simple: use TARGET_*_FLOAT_ABI instead.
>
> I consider this a bug fix: the behavior difference from the toolchain
> convention doc is a bug, and generating object files with SOFT-FLOAT
> flag but parameters/return values passed through FPRs is definitely a
> bug.
>
> Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk and
> release/gcc-12 branch?

LGTM!

Thanks!

>
> gcc/ChangeLog:
>
> 	PR target/109000
> 	* config/loongarch/loongarch.h (FP_RETURN): Use
> 	TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT.
> 	(UNITS_PER_FP_ARG): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 	PR target/109000
> 	* gcc.target/loongarch/flt-abi-isa-1.c: New test.
> 	* gcc.target/loongarch/flt-abi-isa-2.c: New test.
> 	* gcc.target/loongarch/flt-abi-isa-3.c: New test.
> 	* gcc.target/loongarch/flt-abi-isa-4.c: New test.
> ---
>   gcc/config/loongarch/loongarch.h                   |  4 ++--
>   gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c | 14 ++++++++++++++
>   gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c | 10 ++++++++++
>   gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c |  9 +++++++++
>   gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c | 10 ++++++++++
>   5 files changed, 45 insertions(+), 2 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
>   create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c
>   create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c
>   create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c
>
> diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
> index f4e903d46bb..f8167875646 100644
> --- a/gcc/config/loongarch/loongarch.h
> +++ b/gcc/config/loongarch/loongarch.h
> @@ -676,7 +676,7 @@ enum reg_class
>      point values.  */
>   
>   #define GP_RETURN (GP_REG_FIRST + 4)
> -#define FP_RETURN ((TARGET_SOFT_FLOAT) ? GP_RETURN : (FP_REG_FIRST + 0))
> +#define FP_RETURN ((TARGET_SOFT_FLOAT_ABI) ? GP_RETURN : (FP_REG_FIRST + 0))
>   
>   #define MAX_ARGS_IN_REGISTERS 8
>   
> @@ -1154,6 +1154,6 @@ struct GTY (()) machine_function
>   /* The largest type that can be passed in floating-point registers.  */
>   /* TODO: according to mabi.  */
>   #define UNITS_PER_FP_ARG  \
> -  (TARGET_HARD_FLOAT ? (TARGET_DOUBLE_FLOAT ? 8 : 4) : 0)
> +  (TARGET_HARD_FLOAT_ABI ? (TARGET_DOUBLE_FLOAT_ABI ? 8 : 4) : 0)
>   
>   #define FUNCTION_VALUE_REGNO_P(N) ((N) == GP_RETURN || (N) == FP_RETURN)
> diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
> new file mode 100644
> index 00000000000..1c9490f6a87
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mabi=lp64d -mfpu=64 -march=loongarch64 -O2" } */
> +/* { dg-final { scan-assembler "frecip\\.d" } } */
> +/* { dg-final { scan-assembler-not "movgr2fr\\.d" } } */
> +/* { dg-final { scan-assembler-not "movfr2gr\\.d" } } */
> +
> +/* FPU is used for calculation and FPR is used for arguments and return
> +   values.  */
> +
> +double
> +t (double x)
> +{
> +  return 1.0 / x;
> +}
> diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c
> new file mode 100644
> index 00000000000..0580fd65d3a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mabi=lp64s -mfpu=64 -march=loongarch64 -O2" } */
> +/* { dg-final { scan-assembler "frecip\\.d" } } */
> +/* { dg-final { scan-assembler "movgr2fr\\.d" } } */
> +/* { dg-final { scan-assembler "movfr2gr\\.d" } } */
> +
> +/* FPU is used for calculation but FPR cannot be used for arguments and
> +   return values.  */
> +
> +#include "flt-abi-isa-1.c"
> diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c
> new file mode 100644
> index 00000000000..16a926f57a1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mabi=lp64s -mfpu=none -march=loongarch64 -O2" } */
> +/* { dg-final { scan-assembler-not "frecip\\.d" } } */
> +/* { dg-final { scan-assembler-not "movgr2fr\\.d" } } */
> +/* { dg-final { scan-assembler-not "movfr2gr\\.d" } } */
> +
> +/* FPU cannot be used at all.  */
> +
> +#include "flt-abi-isa-1.c"
> diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c
> new file mode 100644
> index 00000000000..43b579c3fac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msoft-float -march=loongarch64 -O2" } */
> +/* { dg-final { scan-assembler-not "frecip\\.d" } } */
> +/* { dg-final { scan-assembler-not "movgr2fr\\.d" } } */
> +/* { dg-final { scan-assembler-not "movfr2gr\\.d" } } */
> +
> +/* -msoft-float implies both -mabi=lp64s and -mfpu=none.
> +   FPU cannot be used at all.  */
> +
> +#include "flt-abi-isa-1.c"