From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=q4Qj=AL=gmail.com=jeffreyalaw@sourceware.org>
Received: from mail-pf1-x434.google.com (mail-pf1-x434.google.com [IPv6:2607:f8b0:4864:20::434])
	by sourceware.org (Postfix) with ESMTPS id 7CF533857714
	for <gcc-patches@gcc.gnu.org>; Thu, 20 Apr 2023 17:40:09 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7CF533857714
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-pf1-x434.google.com with SMTP id d2e1a72fcca58-63b4dfead1bso1197830b3a.3
        for <gcc-patches@gcc.gnu.org>; Thu, 20 Apr 2023 10:40:09 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1682012407; x=1684604407;
        h=content-transfer-encoding:in-reply-to:from:references:cc:to
         :content-language:subject:user-agent:mime-version:date:message-id
         :from:to:cc:subject:date:message-id:reply-to;
        bh=sOLYbCXzDoL1nqdnL+F5gCuRyEkUyxJf8OSg7zDQSYE=;
        b=aLMk89eQOpYXUlJtg2RV8XKCq0mWXmxMh4aXc08/Du3icu5B/0Gwin0+32AwIkTynW
         Cx3TUe3MSqYKo9YYfK3mQuM5NQI53nbtT76a6bQfmuiy7cMTu+5waWLyAEsp2B7/h+K9
         bNOeUjsI1gPmKGdS9xf0PXf3RpFtBWC5+83YJciKWrAIRUCHnjAwa2kgPwgDmS/gQMIp
         rDO8sRQLYpur5/Ye8IVs2+frDYz/fALNpRQ8mVTqXz+gaNvZEb9njXfUu5IP8xg03Lep
         r2PCxkfZ/N9O92Sn1ESu0G5FYAPO2QVxJJOAeoxDngodRWWNAaUmpd/2Vx3HVBuSMQ0M
         Q2aw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1682012407; x=1684604407;
        h=content-transfer-encoding:in-reply-to:from:references:cc:to
         :content-language:subject:user-agent:mime-version:date:message-id
         :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=sOLYbCXzDoL1nqdnL+F5gCuRyEkUyxJf8OSg7zDQSYE=;
        b=WciOJT3OxIe/3xsZEcDL5bO3E2p9ujnsJJRMwYizvhjW+fGEtX6J32DzqW0f8GETnq
         ehndaxcItg2Ox3kB/W2vIxrE1IZ09FIvIQwNoGlVD4zD0hToEcNCdK1UcdQViDWsY82X
         0gvIKwXBAGySzwq+r6g/8Jzyf8s0Xz8s7kRBsBm8G4iZM5C2LD/xMmtIeowewiC60ue1
         KyGOJUyvrUev6xuPOqHYYm5Cvs+inTLilZudIXZqT60pRlHIj4lAdfsfZPaWn8gbtKjb
         YP9GEoSBeQdWcJEdO0q4esejWlkdrV3FT5IcdWkXHrxPPJp9McLG+7Aya1+62AEgpBIp
         VOsQ==
X-Gm-Message-State: AAQBX9c/3OEOrqdpAMMBlZGdAL/8s1Lw0kZ5WrulsRNotXen3YlaWoYi
	H1uApTsC7tdnJ9DJt8tF5r0=
X-Google-Smtp-Source: AKy350ZYIZ3Q09SBDDZ4A8d9hq2uk32bK+LfHSs2jAE29OtfrZ8qybykWdz2L1IBzkSsd+bT7WxeDw==
X-Received: by 2002:a05:6a00:10c9:b0:63b:8b47:453c with SMTP id d9-20020a056a0010c900b0063b8b47453cmr2613743pfu.1.1682012407070;
        Thu, 20 Apr 2023 10:40:07 -0700 (PDT)
Received: from ?IPV6:2601:681:8600:13d0::99f? ([2601:681:8600:13d0::99f])
        by smtp.gmail.com with ESMTPSA id g1-20020a056a00078100b0063b675f01a2sm1510634pfu.26.2023.04.20.10.40.06
        (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
        Thu, 20 Apr 2023 10:40:06 -0700 (PDT)
Message-ID: <f18bfa1e-7d32-e49b-aa86-6d365078264b@gmail.com>
Date: Thu, 20 Apr 2023 11:40:05 -0600
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.9.1
Subject: Re: [PATCH v6] RISC-V: Add support for experimental zfa extension.
Content-Language: en-US
To: Jin Ma <jinma@linux.alibaba.com>, gcc-patches@gcc.gnu.org
Cc: kito.cheng@sifive.com, kito.cheng@gmail.com, palmer@dabbelt.com
References: <20230310124053.164-1-jinma@linux.alibaba.com>
From: Jeff Law <jeffreyalaw@gmail.com>
In-Reply-To: <20230310124053.164-1-jinma@linux.alibaba.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-9.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>


On 3/10/23 05:40, Jin Ma via Gcc-patches wrote:
> This patch adds the 'Zfa' extension for riscv, which is based on:
>   https://github.com/riscv/riscv-isa-manual/commit/d74d99e22d5f68832f70982d867614e2149a3bd7
> latest 'Zfa' change on the master branch of the RISC-V ISA Manual as
> of this writing.
> 
> The Wiki Page (details):
>   https://github.com/a4lg/binutils-gdb/wiki/riscv_zfa
> 
> The binutils-gdb for 'Zfa' extension:
>   https://sourceware.org/pipermail/binutils/2022-September/122938.html
> 
> Implementation of zfa extension on LLVM:
>    https://reviews.llvm.org/rGc0947dc44109252fcc0f68a542fc6ef250d4d3a9
> 
> There are three points that need to be discussed here.
> 1. According to riscv-spec, "The FCVTMO D.W.D instruction was added principally to
>    accelerate the processing of JavaScript Numbers.", so it seems that no implementation
>    is required in the compiler.
> 2. The FROUND and FROUNDN instructions in this patch use related functions in the math
>    library, such as round, floor, ceil, etc. Since there is no interface for half-precision in
>    the math library, the instructions FROUN D.H and FROUNDN X.H have not been implemented for
>    the time being. Is it necessary to add a built-in interface belonging to riscv such as
>   __builtin_roundhf or __builtin_roundf16 to generate half floating point instructions?
> 3. As far as I know, FMINM and FMAXM instructions correspond to C23 library function fminimum
>    and fmaximum. Therefore, I have not dealt with such instructions for the time being, but have
>    simply implemented the pattern of fminm<hf\sf\df>3 and fmaxm<hf\sf\df>3. Is it necessary to
>    add a built-in interface belonging to riscv such as__builtin_fminm to generate half
>    floating-point instructions?
> 
> gcc/ChangeLog:
> 
> 	* common/config/riscv/riscv-common.cc: Add zfa extension.
> 	* config/riscv/constraints.md (Zf): Constrain the floating point number that the FLI instruction can load.
> 	* config/riscv/iterators.md (round_pattern): New.
> 	* config/riscv/predicates.md: Predicate the floating point number that the FLI instruction can load.
> 	* config/riscv/riscv-opts.h (MASK_ZFA): New.
> 	(TARGET_ZFA): New.
> 	* config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): Get the index of the
>            floating-point number that the FLI instruction can load.
> 	* config/riscv/riscv.cc (find_index_in_array): New.
> 	(riscv_float_const_rtx_index_for_fli): New.
> 	(riscv_cannot_force_const_mem): Likewise.
> 	(riscv_const_insns): Likewise.
> 	(riscv_legitimize_const_move): Likewise.
> 	(riscv_split_64bit_move_p): Exclude floating point numbers that can be loaded by FLI instructions.
> 	(riscv_output_move): Likewise.
> 	(riscv_memmodel_needs_release_fence): Likewise.
> 	(riscv_print_operand): Likewise.
> 	(riscv_secondary_memory_needed): Likewise.
> 	* config/riscv/riscv.h (GP_REG_RTX_P): New.
> 	* config/riscv/riscv.md (fminm<mode>3): New.
> 	(fmaxm<mode>3): New.
> 	(<round_pattern><ANYF:mode>2): New.
> 	(rint<ANYF:mode>2): New.
> 	(f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa): New.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fleq-fltq.c: New test.
> 	* gcc.target/riscv/zfa-fli-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fli-zfh.c: New test.
> 	* gcc.target/riscv/zfa-fli.c: New test.
> 	* gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fround-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fround.c: New test.
> ---
>   gcc/common/config/riscv/riscv-common.cc       |   4 +
>   gcc/config/riscv/constraints.md               |   7 +
>   gcc/config/riscv/iterators.md                 |   5 +
>   gcc/config/riscv/predicates.md                |   4 +
>   gcc/config/riscv/riscv-opts.h                 |   3 +
>   gcc/config/riscv/riscv-protos.h               |   1 +
>   gcc/config/riscv/riscv.cc                     | 168 +++++++++++++++++-
>   gcc/config/riscv/riscv.h                      |   1 +
>   gcc/config/riscv/riscv.md                     | 112 +++++++++---
>   .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
>   .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
>   gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 ++++++++
>   .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 +++++
>   gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 +++++
>   gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 ++++++++
>   .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 ++
>   .../gcc.target/riscv/zfa-fround-rv32.c        |  42 +++++
>   gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 +++++
>   18 files changed, 654 insertions(+), 23 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
> 

> diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> index 5b70ab20758..ef9b1aa9ed3 100644
> --- a/gcc/config/riscv/iterators.md
> +++ b/gcc/config/riscv/iterators.md
> @@ -284,3 +284,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
>   (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
>   (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
>   
> +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
> +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
> +				(UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
> +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
> +			   (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
> \ No newline at end of file
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index 0d9d7701c7e..b16b8d438e3 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -149,6 +149,10 @@ (define_predicate "move_operand"
>       case CONST_POLY_INT:
>         return known_eq (rtx_to_poly_int64 (op), BYTES_PER_RISCV_VECTOR);
>   
> +    case CONST_DOUBLE:
> +      return const_0_operand (op, mode)
> +	      || (riscv_float_const_rtx_index_for_fli (op) != -1);
Formatting nit.  When you have a line break like this, go ahead and wrap 
the whole expression with parens and put the operator one space past the 
open paren.  Like this:

> 
>     case CONST_DOUBLE
>       return (const_0_operand (op, mode)
>               || riscv_float_const_rtx_index_for_fli (op) != -1);


I used spaces so that mailers don't muck up the formating.  Please make 
sure to use tabs insted of 8 spaces if the indent is >= 8.


> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index c91fa3101aa..c81a2bf44f5 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -799,6 +799,108 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
>       }
>   }
>   
> +/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
> +   Manual draft. For details, please see:
> +   https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
> +
> +unsigned HOST_WIDE_INT fli_value_hf[32] =
> +{
> +  0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
> +  0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
> +  0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
> +  0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
> +  /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
> +  0x7800,
> +  0x7c00, 0x7e00,
> +};
> +
> +unsigned HOST_WIDE_INT fli_value_sf[32] =
> +{
> +  0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
> +  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
> +  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000, 0x40400000, 0x40800000,
> +  0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
> +};
> +
> +unsigned HOST_WIDE_INT fli_value_df[32] =
> +{
> +  0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
> +  0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000, 0x3fc0000000000000,
> +  0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000, 0x3fdc000000000000,
> +  0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000, 0x3fec000000000000,
> +  0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000, 0x3ffc000000000000,
> +  0x4000000000000000, 0x4004000000000000, 0x4008000000000000, 0x4010000000000000,
> +  0x4020000000000000, 0x4030000000000000, 0x4060000000000000, 0x4070000000000000,
> +  0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000, 0x7ff8000000000000,
> +};
Would it make more sense to use the newer floating point literal 
formats?  I like them marginally more than hext constants because many 
tools can consume that floating point literal syntax trivially.


> +
> +/* Find the index of TARGET in ARRAY, and return -1 if not found. */
"Return the index of TARGET in ARRAY, return -1 if not found.  */


> +
> +/* Return index of the FLI instruction table if rtx X is an immediate constant that
> +   can be moved using a single FLI instruction in zfa extension. -1 otherwise. */
> +
> +int
> +riscv_float_const_rtx_index_for_fli (rtx x)
> +{
> +  machine_mode mode = GET_MODE (x);
> +
> +  if (!TARGET_ZFA || mode == VOIDmode
> +      || !CONST_DOUBLE_P(x)
> +      || (mode == HFmode && !TARGET_ZFH)
> +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> +    return -1;
Bring the "|| mode == VOIDmode" down on its own line.


> @@ -1191,6 +1296,8 @@ riscv_const_insns (rtx x)
>         }
>   
>       case CONST_DOUBLE:
> +      if (riscv_float_const_rtx_index_for_fli (x) != -1)
> +	return 4;
Should this be returning 1?  That's the case when we find the constant 
in the table and have the FLI instruction.


>     /* Allow FPR <-> FPR and FPR <-> MEM moves, and permit the special case
>        of zeroing an FPR with FCVT.D.W.  */
>     if (TARGET_DOUBLE_FLOAT
>         && ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
>   	  || (FP_REG_RTX_P (dest) && MEM_P (src))
>   	  || (FP_REG_RTX_P (src) && MEM_P (dest))
> +	  || (TARGET_ZFA
> +	      && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
> +	      || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
>   	  || (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))))
>       return false;

The formatting here seems off.  In particular the last || line you added 
seems like it might need to be indented further.  It's also possible the 
mailer has mucked up.  So double check and fix is it needs adjustment.


In the .md file changes, several of the patterns use UNSPECs.  If 
possible we should try to avoid those when there are reasonable mappings 
to GCC's RTL codes.  Can you please double-check to see if any of the 
cases you're adding can be properly handled by the existing RTL codes?

Overall it looks pretty close.

Jeff