From: Andrew Carlotti <andrew.carlotti@arm.com>
To: Victor Do Nascimento <victor.donascimento@arm.com>
Cc: binutils@sourceware.org, richard.earnshaw@arm.com, nickc@redhat.com
Subject: Re: [PATCH 2/4] aarch64: fp8 convert and scale - Add advsimd insn variants
Date: Mon, 20 May 2024 16:36:09 +0100
Message-ID: <734c54de-1d64-d2a2-c11e-f5aa67193623@e124511.cambridge.arm.com>
In-Reply-To: <20240410152950.1134020-3-victor.donascimento@arm.com>
On Wed, Apr 10, 2024 at 04:29:48PM +0100, Victor Do Nascimento wrote:
> Add the advanced SIMD variant of the FP8 convert and scale
> instructions, enabled at assembly-time using the `+fp8'
> architectural extension flag. More specifically, support is
> added for the following instructions:
>
...
> diff --git a/gas/testsuite/gas/aarch64/advsimd-fp8.s b/gas/testsuite/gas/aarch64/advsimd-fp8.s
> new file mode 100644
> index 00000000000..e49f38d420a
> --- /dev/null
> +++ b/gas/testsuite/gas/aarch64/advsimd-fp8.s
> @@ -0,0 +1,76 @@
> + /* advsimd-fp8.s Test file for AArch64 8-bit floating-point vector
> + instructions. */
> +
> + /* Instructions convert the elements from the lower half of the source
> + vector while scaling the values by 2^-UInt(FPMR.LSCALE{2}[3:0]). */
> +
> + .macro cvrt_lowerhalf, op
> + \op v0.8h, v0.8b
> + \op v1.8h, v0.8b
> + \op v0.8h, v1.8b
> + \op v1.8h, v1.8b
> + \op v16.8h, v17.8b
> + .endm
> +
> + cvrt_lowerhalf bf1cvtl
> + cvrt_lowerhalf bf2cvtl
> + cvrt_lowerhalf f1cvtl
> + cvrt_lowerhalf f2cvtl
> +
> + /* Instructions convert the elements from the upper half of the source
> + vector while scaling the values by 2^-UInt(FPMR.LSCALE{2}[3:0]). */
> +
> + .macro cvrt_upperhalf, op
> + \op v0.8h, v0.16b
> + \op v1.8h, v0.16b
> + \op v0.8h, v1.16b
> + \op v1.8h, v1.16b
> + \op v16.8h, v17.16b
> + .endm
> +
> + cvrt_upperhalf bf1cvtl2
> + cvrt_upperhalf bf2cvtl2
> + cvrt_upperhalf f1cvtl2
> + cvrt_upperhalf f2cvtl2
> +
> + /* Floating-point adjust exponent by vector. */
> +
> + .macro fscale_gen, op_var
> + fscale v0.\op_var, v0.\op_var, v0.\op_var
> + fscale v1.\op_var, v0.\op_var, v0.\op_var
> + fscale v0.\op_var, v1.\op_var, v0.\op_var
> + fscale v0.\op_var, v0.\op_var, v1.\op_var
> + fscale v1.\op_var, v1.\op_var, v0.\op_var
> + fscale v0.\op_var, v1.\op_var, v1.\op_var
> + fscale v1.\op_var, v1.\op_var, v1.\op_var
> + fscale v16.\op_var, v17.\op_var, v18.\op_var
> + .endm
> +
> + /* Half-precision variant. */
> + fscale_gen 4h
> + fscale_gen 8h
> + /* Single-precision variant. */
> + fscale_gen 2s
> + fscale_gen 4s
> + fscale_gen 2d
> +
> + /* Half and single-precision to FP8 convert and narrow. */
> +
> + .macro fcvtn_to_fp8, op, sd, ss
> + \op v0.\sd, v0.\ss, v0.\ss
> + \op v1.\sd, v0.\ss, v0.\ss
> + \op v0.\sd, v1.\ss, v0.\ss
> + \op v0.\sd, v0.\ss, v1.\ss
> + \op v1.\sd, v1.\ss, v0.\ss
> + \op v0.\sd, v1.\ss, v1.\ss
> + \op v1.\sd, v1.\ss, v1.\ss
> + \op v16.\sd, v17.\ss, v18.\ss
> + .endm
> +
> + /* Half-precision variant. */
> + fcvtn_to_fp8 fcvtn 8b, 4h
> + fcvtn_to_fp8 fcvtn 16b, 8h
> +
> + /* Single-precision variant. */
> + fcvtn_to_fp8 fcvtn, 8b, 4s
> + fcvtn_to_fp8 fcvtn2, 16b, 4s

Some register operand bits are always 0 in these tests, so the tests don't
show that the opcode mask is correct at those bit positions. It would be better
to ensure that each operand bit is set to both 0 and 1 in some test. A good way
to do this in general is to have tests that use register 31 (or the highest
valid register) for each operand in turn, while leaving the other registers set
to 0.

> diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
> index 7e603462a37..f876c1b342f 100644
> --- a/opcodes/aarch64-tbl.h
> +++ b/opcodes/aarch64-tbl.h
> @@ -2368,6 +2368,34 @@
> QLF3(X,X,NIL), \
> }
>
> +#define QL_V3_BSS_LOWER \
> +{ \
> + QLF3(V_8B, V_4S, V_4S), \
> +}
> +
> +#define QL_V3_BSS_FULL \
> +{ \
> + QLF3(V_16B, V_4S, V_4S), \
> +}
> +
> +#define QL_V3_BHH \
> +{ \
> + QLF3(V_8B, V_4H, V_4H), \
> + QLF3(V_16B, V_8H, V_8H), \
> +}
> +
> +/* e.g. BF1CVTL <Vd>.8H, <Vn>.8B. */
> +#define QL_V2FP8B8H \
> +{ \
> + QLF2(V_8H, V_8B), \
> +}

How about aligning the names with existing qualifier sets, e.g. QL_V2LONGB{2} to
match QL_V2LONGHS{2}? Alternatively, QL_V2_HB_{LOWER|FULL} would match your
other new qualifiers.

> +/* e.g. BF1CVTL2 <Vd>.8H, <Vn>.16B. */
> +#define QL_V28H16B \
> +{ \
> + QLF2(V_8H, V_16B), \
> +}
> +
> /* e.g. UDOT <Vd>.2S, <Vn>.8B, <Vm>.8B. */
> #define QL_V3DOT \
> { \
> @@ -6459,6 +6487,19 @@ const struct aarch64_opcode aarch64_opcode_table[] =
> SVE2p1_INSNC("st2q",0xe4600000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
> SVE2p1_INSNC("st3q",0xe4a00000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
> SVE2p1_INSNC("st4q",0xe4e00000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
> + FP8_INSN("bf1cvtl", 0x2ea17800, 0xfffffc00, asimdmisc, OP2 (Vd, Vn), QL_V2FP8B8H, 0),
> + FP8_INSN("bf1cvtl2", 0x6ea17800, 0xfffffc00, asimdmisc, OP2 (Vd, Vn), QL_V28H16B, 0),
> + FP8_INSN("bf2cvtl", 0x2ee17800, 0xfffffc00, asimdmisc, OP2 (Vd, Vn), QL_V2FP8B8H, 0),
> + FP8_INSN("bf2cvtl2", 0x6ee17800, 0xfffffc00, asimdmisc, OP2 (Vd, Vn), QL_V28H16B, 0),
> + FP8_INSN("f1cvtl", 0x2e217800, 0xfffffc00, asimdmisc, OP2 (Vd, Vn), QL_V2FP8B8H, 0),
> + FP8_INSN("f1cvtl2", 0x6e217800, 0xfffffc00, asimdmisc, OP2 (Vd, Vn), QL_V28H16B, 0),
> + FP8_INSN("f2cvtl", 0x2e617800, 0xfffffc00, asimdmisc, OP2 (Vd, Vn), QL_V2FP8B8H, 0),
> + FP8_INSN("f2cvtl2", 0x6e617800, 0xfffffc00, asimdmisc, OP2 (Vd, Vn), QL_V28H16B, 0),
> + FP8_INSN("fcvtn", 0xe00f400, 0xffe0fc00, asimdmisc, OP3 (Vd, Vn, Vm), QL_V3_BSS_LOWER, 0),
> + FP8_INSN("fcvtn2", 0x4e00f400, 0xffe0fc00, asimdmisc, OP3 (Vd, Vn, Vm), QL_V3_BSS_FULL, 0),
> + FP8_INSN("fcvtn", 0xe40f400, 0xbfe0fc00, asimdmisc, OP3 (Vd, Vn, Vm), QL_V3_BHH, F_SIZEQ),
> + FP8_INSN("fscale", 0x2ec03c00, 0xbfe0fc00, asimdmisc, OP3 (Vd, Vn, Vm), QL_VSHIFT_H, F_SIZEQ),
> + FP8_INSN("fscale", 0x2ea0fc00, 0xbfa0fc00, asimdmisc, OP3 (Vd, Vn, Vm), QL_V3SAMESD, F_SIZEQ),
>
> /* Checked Pointer Arithmetic Instructions. */
> CPA_INSN ("addpt", 0x9a002000, 0xffe0e000, aarch64_misc, OP3 (Rd_SP, Rn_SP, Rm_LSL), QL_I3SAMEX),
> --
> 2.34.1
>