From: Victor Do Nascimento <victor.donascimento@arm.com>
To: <binutils@sourceware.org>
Cc: <richard.earnshaw@arm.com>, <nickc@redhat.com>,
	Victor Do Nascimento <vicdon01@e133397.arm.com>
Subject: [PATCH 3/4] aarch64: fp8 convert and scale - add sve2 insn variants
Date: Wed, 10 Apr 2024 16:29:49 +0100
Message-ID: <20240410152950.1134020-4-victor.donascimento@arm.com>
In-Reply-To: <20240410152950.1134020-1-victor.donascimento@arm.com>

From: Victor Do Nascimento <vicdon01@e133397.arm.com>

Add the SVE2 variants of the FP8 convert and scale instructions,
enabled at assembly time with the `+sve2+fp8' architectural
extension flags.  Support is added for the following instructions:

FP8 convert to BFloat16 (bottom/top):
-------------------------------------

  - bf1cvt Z<d>.H, Z<n>.B
  - bf2cvt Z<d>.H, Z<n>.B
  - bf1cvtlt Z<d>.H, Z<n>.B
  - bf2cvtlt Z<d>.H, Z<n>.B

FP8 convert to half-precision (bottom/top):
-------------------------------------------

  - f1cvt Z<d>.H, Z<n>.B
  - f2cvt Z<d>.H, Z<n>.B
  - f1cvtlt Z<d>.H, Z<n>.B
  - f2cvtlt Z<d>.H, Z<n>.B

BFloat16/half-precision convert, narrow and
interleave to FP8:
-------------------------------------------

  - bfcvtn Z<d>.B, { Z<n>1.H - Z<n>2.H }
  - fcvtn Z<d>.B, { Z<n>1.H - Z<n>2.H }

Single-precision convert, narrow and interleave
to FP8 (bottom/top):
-----------------------------------------------

  - fcvtnb Z<d>.B, { Z<n>1.S - Z<n>2.S }
  - fcvtnt Z<d>.B, { Z<n>1.S - Z<n>2.S }
---
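Reading aid, not part of the patch: every expected encoding in
sve2-fp8-dump appears to follow one pattern, a fixed base opcode with
Zn in bits [9:5] and Zd in bits [4:0].  Below is a minimal C sketch of
that pattern.  The names fp8_op, fp8_ops, fp8_encode, fp8_mnemonic and
fp8_self_check are hypothetical, the base values are read off the dump
with both registers set to z0, and matching on the upper 22 bits is a
simplification of the generated lookup in aarch64-dis-2.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table: base opcodes taken from sve2-fp8-dump with
   Zd = Zn = 0.  These are not the real aarch64-tbl.h entries.  */
struct fp8_op
{
  const char *name;
  uint32_t base;
};

static const struct fp8_op fp8_ops[] =
{
  { "bf1cvt",   0x65083800 }, { "bf2cvt",   0x65083c00 },
  { "bf1cvtlt", 0x65093800 }, { "bf2cvtlt", 0x65093c00 },
  { "f1cvt",    0x65083000 }, { "f2cvt",    0x65083400 },
  { "f1cvtlt",  0x65093000 }, { "f2cvtlt",  0x65093400 },
  { "bfcvtn",   0x650a3800 }, { "fcvtn",    0x650a3000 },
  { "fcvtnb",   0x650a3400 }, { "fcvtnt",   0x650a3c00 },
};

#define FP8_NUM_OPS (sizeof (fp8_ops) / sizeof (fp8_ops[0]))

/* Insert the register numbers into a base opcode.  For the narrowing
   forms ZN is the first register of the pair and has to be even; this
   sketch does not check that.  */
static uint32_t
fp8_encode (uint32_t base, unsigned zd, unsigned zn)
{
  return base | ((uint32_t) (zn & 0x1f) << 5) | (zd & 0x1f);
}

/* Map an instruction word back to a mnemonic by clearing the two
   register fields.  Returns NULL if the word is not in the table.  */
static const char *
fp8_mnemonic (uint32_t word)
{
  uint32_t base = word & ~(uint32_t) 0x3ff;

  for (size_t i = 0; i < FP8_NUM_OPS; i++)
    if (fp8_ops[i].base == base)
      return fp8_ops[i].name;
  return NULL;
}

/* Spot checks against two lines of the expected dump.  */
static void
fp8_self_check (void)
{
  /* bf1cvt z30.h, z31.b  */
  assert (fp8_encode (0x65083800, 30, 31) == 0x65083bfe);
  /* fcvtnb z0.b, {z2.s-z3.s}  */
  assert (strcmp (fp8_mnemonic (0x650a3440), "fcvtnb") == 0);
}
```

For example, fp8_encode (0x65083800, 30, 31) gives 0x65083bfe, which
is the dump entry for `bf1cvt z30.h, z31.b'.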
 .../gas/aarch64/sme2-fp8-streaming.d          |   4 +
 gas/testsuite/gas/aarch64/sve2-fp8-dump       |  53 +++++
 gas/testsuite/gas/aarch64/sve2-fp8-fail.d     |   2 +
 gas/testsuite/gas/aarch64/sve2-fp8-fail.l     | 161 +++++++++++++++
 gas/testsuite/gas/aarch64/sve2-fp8-fail.s     |  42 ++++
 gas/testsuite/gas/aarch64/sve2-fp8.d          |   3 +
 gas/testsuite/gas/aarch64/sve2-fp8.s          |  48 +++++
 opcodes/aarch64-dis-2.c                       | 194 +++++++++++++++---
 opcodes/aarch64-tbl.h                         |  20 ++
 9 files changed, 496 insertions(+), 31 deletions(-)
 create mode 100644 gas/testsuite/gas/aarch64/sme2-fp8-streaming.d
 create mode 100644 gas/testsuite/gas/aarch64/sve2-fp8-dump
 create mode 100644 gas/testsuite/gas/aarch64/sve2-fp8-fail.d
 create mode 100644 gas/testsuite/gas/aarch64/sve2-fp8-fail.l
 create mode 100644 gas/testsuite/gas/aarch64/sve2-fp8-fail.s
 create mode 100644 gas/testsuite/gas/aarch64/sve2-fp8.d
 create mode 100644 gas/testsuite/gas/aarch64/sve2-fp8.s

diff --git a/gas/testsuite/gas/aarch64/sme2-fp8-streaming.d b/gas/testsuite/gas/aarch64/sme2-fp8-streaming.d
new file mode 100644
index 00000000000..16ed6b88bcd
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sme2-fp8-streaming.d
@@ -0,0 +1,4 @@
+#as: -march=armv8.5-a+fp8+sme2
+#objdump: -dr
+#source: sve2-fp8.s
+#dump: sve2-fp8-dump
diff --git a/gas/testsuite/gas/aarch64/sve2-fp8-dump b/gas/testsuite/gas/aarch64/sve2-fp8-dump
new file mode 100644
index 00000000000..570ff9c4da4
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2-fp8-dump
@@ -0,0 +1,53 @@
+.*:     file format .*
+
+Disassembly of section \.text:
+
+0+ <.*>:
+[ ]*[0-9a-f]+:	65083800 	bf1cvt	z0.h, z0.b
+[ ]*[0-9a-f]+:	65083801 	bf1cvt	z1.h, z0.b
+[ ]*[0-9a-f]+:	65083820 	bf1cvt	z0.h, z1.b
+[ ]*[0-9a-f]+:	65083bfe 	bf1cvt	z30.h, z31.b
+[ ]*[0-9a-f]+:	65083c00 	bf2cvt	z0.h, z0.b
+[ ]*[0-9a-f]+:	65083c01 	bf2cvt	z1.h, z0.b
+[ ]*[0-9a-f]+:	65083c20 	bf2cvt	z0.h, z1.b
+[ ]*[0-9a-f]+:	65083ffe 	bf2cvt	z30.h, z31.b
+[ ]*[0-9a-f]+:	65093800 	bf1cvtlt	z0.h, z0.b
+[ ]*[0-9a-f]+:	65093801 	bf1cvtlt	z1.h, z0.b
+[ ]*[0-9a-f]+:	65093820 	bf1cvtlt	z0.h, z1.b
+[ ]*[0-9a-f]+:	65093bfe 	bf1cvtlt	z30.h, z31.b
+[ ]*[0-9a-f]+:	65093c00 	bf2cvtlt	z0.h, z0.b
+[ ]*[0-9a-f]+:	65093c01 	bf2cvtlt	z1.h, z0.b
+[ ]*[0-9a-f]+:	65093c20 	bf2cvtlt	z0.h, z1.b
+[ ]*[0-9a-f]+:	65093ffe 	bf2cvtlt	z30.h, z31.b
+[ ]*[0-9a-f]+:	65083000 	f1cvt	z0.h, z0.b
+[ ]*[0-9a-f]+:	65083001 	f1cvt	z1.h, z0.b
+[ ]*[0-9a-f]+:	65083020 	f1cvt	z0.h, z1.b
+[ ]*[0-9a-f]+:	650833fe 	f1cvt	z30.h, z31.b
+[ ]*[0-9a-f]+:	65083400 	f2cvt	z0.h, z0.b
+[ ]*[0-9a-f]+:	65083401 	f2cvt	z1.h, z0.b
+[ ]*[0-9a-f]+:	65083420 	f2cvt	z0.h, z1.b
+[ ]*[0-9a-f]+:	650837fe 	f2cvt	z30.h, z31.b
+[ ]*[0-9a-f]+:	65093000 	f1cvtlt	z0.h, z0.b
+[ ]*[0-9a-f]+:	65093001 	f1cvtlt	z1.h, z0.b
+[ ]*[0-9a-f]+:	65093020 	f1cvtlt	z0.h, z1.b
+[ ]*[0-9a-f]+:	650933fe 	f1cvtlt	z30.h, z31.b
+[ ]*[0-9a-f]+:	65093400 	f2cvtlt	z0.h, z0.b
+[ ]*[0-9a-f]+:	65093401 	f2cvtlt	z1.h, z0.b
+[ ]*[0-9a-f]+:	65093420 	f2cvtlt	z0.h, z1.b
+[ ]*[0-9a-f]+:	650937fe 	f2cvtlt	z30.h, z31.b
+[ ]*[0-9a-f]+:	650a3800 	bfcvtn	z0.b, {z0.h-z1.h}
+[ ]*[0-9a-f]+:	650a3801 	bfcvtn	z1.b, {z0.h-z1.h}
+[ ]*[0-9a-f]+:	650a3840 	bfcvtn	z0.b, {z2.h-z3.h}
+[ ]*[0-9a-f]+:	650a3bdd 	bfcvtn	z29.b, {z30.h-z31.h}
+[ ]*[0-9a-f]+:	650a3000 	fcvtn	z0.b, {z0.h-z1.h}
+[ ]*[0-9a-f]+:	650a3001 	fcvtn	z1.b, {z0.h-z1.h}
+[ ]*[0-9a-f]+:	650a3040 	fcvtn	z0.b, {z2.h-z3.h}
+[ ]*[0-9a-f]+:	650a33dd 	fcvtn	z29.b, {z30.h-z31.h}
+[ ]*[0-9a-f]+:	650a3400 	fcvtnb	z0.b, {z0.s-z1.s}
+[ ]*[0-9a-f]+:	650a3401 	fcvtnb	z1.b, {z0.s-z1.s}
+[ ]*[0-9a-f]+:	650a3440 	fcvtnb	z0.b, {z2.s-z3.s}
+[ ]*[0-9a-f]+:	650a37dd 	fcvtnb	z29.b, {z30.s-z31.s}
+[ ]*[0-9a-f]+:	650a3c00 	fcvtnt	z0.b, {z0.s-z1.s}
+[ ]*[0-9a-f]+:	650a3c01 	fcvtnt	z1.b, {z0.s-z1.s}
+[ ]*[0-9a-f]+:	650a3c40 	fcvtnt	z0.b, {z2.s-z3.s}
+[ ]*[0-9a-f]+:	650a3fdd 	fcvtnt	z29.b, {z30.s-z31.s}
\ No newline at end of file
diff --git a/gas/testsuite/gas/aarch64/sve2-fp8-fail.d b/gas/testsuite/gas/aarch64/sve2-fp8-fail.d
new file mode 100644
index 00000000000..f20d457b5ae
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2-fp8-fail.d
@@ -0,0 +1,2 @@
+#as: -march=armv8.5-a+fp8+sve2 -mno-verbose-error
+#error_output: sve2-fp8-fail.l
diff --git a/gas/testsuite/gas/aarch64/sve2-fp8-fail.l b/gas/testsuite/gas/aarch64/sve2-fp8-fail.l
new file mode 100644
index 00000000000..ab48ff464d7
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2-fp8-fail.l
@@ -0,0 +1,161 @@
+[^:]+: Assembler messages:
+[^:]+:6: Error: operand mismatch -- `bf1cvt z0.b,z1.b'
+[^:]+:30:  Info: macro invoked from here
+[^:]+:8: Error: operand mismatch -- `bf1cvt z0.s,z1.b'
+[^:]+:30:  Info: macro invoked from here
+[^:]+:9: Error: operand mismatch -- `bf1cvt z0.d,z1.b'
+[^:]+:30:  Info: macro invoked from here
+[^:]+:12: Error: operand mismatch -- `bf1cvt z0.h,z1.h'
+[^:]+:30:  Info: macro invoked from here
+[^:]+:13: Error: operand mismatch -- `bf1cvt z0.h,z1.s'
+[^:]+:30:  Info: macro invoked from here
+[^:]+:14: Error: operand mismatch -- `bf1cvt z0.h,z1.d'
+[^:]+:30:  Info: macro invoked from here
+[^:]+:16: Error: expected an SVE vector register at operand 2 -- `bf1cvt z0.h,p0,z1.d'
+[^:]+:30:  Info: macro invoked from here
+[^:]+:17: Error: expected an SVE vector register at operand 2 -- `bf1cvt z0.h,p0/z,z1.d'
+[^:]+:30:  Info: macro invoked from here
+[^:]+:6: Error: operand mismatch -- `bf2cvt z0.b,z1.b'
+[^:]+:31:  Info: macro invoked from here
+[^:]+:8: Error: operand mismatch -- `bf2cvt z0.s,z1.b'
+[^:]+:31:  Info: macro invoked from here
+[^:]+:9: Error: operand mismatch -- `bf2cvt z0.d,z1.b'
+[^:]+:31:  Info: macro invoked from here
+[^:]+:12: Error: operand mismatch -- `bf2cvt z0.h,z1.h'
+[^:]+:31:  Info: macro invoked from here
+[^:]+:13: Error: operand mismatch -- `bf2cvt z0.h,z1.s'
+[^:]+:31:  Info: macro invoked from here
+[^:]+:14: Error: operand mismatch -- `bf2cvt z0.h,z1.d'
+[^:]+:31:  Info: macro invoked from here
+[^:]+:16: Error: expected an SVE vector register at operand 2 -- `bf2cvt z0.h,p0,z1.d'
+[^:]+:31:  Info: macro invoked from here
+[^:]+:17: Error: expected an SVE vector register at operand 2 -- `bf2cvt z0.h,p0/z,z1.d'
+[^:]+:31:  Info: macro invoked from here
+[^:]+:6: Error: operand mismatch -- `bf1cvtlt z0.b,z1.b'
+[^:]+:32:  Info: macro invoked from here
+[^:]+:8: Error: operand mismatch -- `bf1cvtlt z0.s,z1.b'
+[^:]+:32:  Info: macro invoked from here
+[^:]+:9: Error: operand mismatch -- `bf1cvtlt z0.d,z1.b'
+[^:]+:32:  Info: macro invoked from here
+[^:]+:12: Error: operand mismatch -- `bf1cvtlt z0.h,z1.h'
+[^:]+:32:  Info: macro invoked from here
+[^:]+:13: Error: operand mismatch -- `bf1cvtlt z0.h,z1.s'
+[^:]+:32:  Info: macro invoked from here
+[^:]+:14: Error: operand mismatch -- `bf1cvtlt z0.h,z1.d'
+[^:]+:32:  Info: macro invoked from here
+[^:]+:16: Error: expected an SVE vector register at operand 2 -- `bf1cvtlt z0.h,p0,z1.d'
+[^:]+:32:  Info: macro invoked from here
+[^:]+:17: Error: expected an SVE vector register at operand 2 -- `bf1cvtlt z0.h,p0/z,z1.d'
+[^:]+:32:  Info: macro invoked from here
+[^:]+:6: Error: operand mismatch -- `bf2cvtlt z0.b,z1.b'
+[^:]+:33:  Info: macro invoked from here
+[^:]+:8: Error: operand mismatch -- `bf2cvtlt z0.s,z1.b'
+[^:]+:33:  Info: macro invoked from here
+[^:]+:9: Error: operand mismatch -- `bf2cvtlt z0.d,z1.b'
+[^:]+:33:  Info: macro invoked from here
+[^:]+:12: Error: operand mismatch -- `bf2cvtlt z0.h,z1.h'
+[^:]+:33:  Info: macro invoked from here
+[^:]+:13: Error: operand mismatch -- `bf2cvtlt z0.h,z1.s'
+[^:]+:33:  Info: macro invoked from here
+[^:]+:14: Error: operand mismatch -- `bf2cvtlt z0.h,z1.d'
+[^:]+:33:  Info: macro invoked from here
+[^:]+:16: Error: expected an SVE vector register at operand 2 -- `bf2cvtlt z0.h,p0,z1.d'
+[^:]+:33:  Info: macro invoked from here
+[^:]+:17: Error: expected an SVE vector register at operand 2 -- `bf2cvtlt z0.h,p0/z,z1.d'
+[^:]+:33:  Info: macro invoked from here
+[^:]+:6: Error: operand mismatch -- `f1cvt z0.b,z1.b'
+[^:]+:34:  Info: macro invoked from here
+[^:]+:8: Error: operand mismatch -- `f1cvt z0.s,z1.b'
+[^:]+:34:  Info: macro invoked from here
+[^:]+:9: Error: operand mismatch -- `f1cvt z0.d,z1.b'
+[^:]+:34:  Info: macro invoked from here
+[^:]+:12: Error: operand mismatch -- `f1cvt z0.h,z1.h'
+[^:]+:34:  Info: macro invoked from here
+[^:]+:13: Error: operand mismatch -- `f1cvt z0.h,z1.s'
+[^:]+:34:  Info: macro invoked from here
+[^:]+:14: Error: operand mismatch -- `f1cvt z0.h,z1.d'
+[^:]+:34:  Info: macro invoked from here
+[^:]+:16: Error: expected an SVE vector register at operand 2 -- `f1cvt z0.h,p0,z1.d'
+[^:]+:34:  Info: macro invoked from here
+[^:]+:17: Error: expected an SVE vector register at operand 2 -- `f1cvt z0.h,p0/z,z1.d'
+[^:]+:34:  Info: macro invoked from here
+[^:]+:6: Error: operand mismatch -- `f2cvt z0.b,z1.b'
+[^:]+:35:  Info: macro invoked from here
+[^:]+:8: Error: operand mismatch -- `f2cvt z0.s,z1.b'
+[^:]+:35:  Info: macro invoked from here
+[^:]+:9: Error: operand mismatch -- `f2cvt z0.d,z1.b'
+[^:]+:35:  Info: macro invoked from here
+[^:]+:12: Error: operand mismatch -- `f2cvt z0.h,z1.h'
+[^:]+:35:  Info: macro invoked from here
+[^:]+:13: Error: operand mismatch -- `f2cvt z0.h,z1.s'
+[^:]+:35:  Info: macro invoked from here
+[^:]+:14: Error: operand mismatch -- `f2cvt z0.h,z1.d'
+[^:]+:35:  Info: macro invoked from here
+[^:]+:16: Error: expected an SVE vector register at operand 2 -- `f2cvt z0.h,p0,z1.d'
+[^:]+:35:  Info: macro invoked from here
+[^:]+:17: Error: expected an SVE vector register at operand 2 -- `f2cvt z0.h,p0/z,z1.d'
+[^:]+:35:  Info: macro invoked from here
+[^:]+:6: Error: operand mismatch -- `f1cvtlt z0.b,z1.b'
+[^:]+:36:  Info: macro invoked from here
+[^:]+:8: Error: operand mismatch -- `f1cvtlt z0.s,z1.b'
+[^:]+:36:  Info: macro invoked from here
+[^:]+:9: Error: operand mismatch -- `f1cvtlt z0.d,z1.b'
+[^:]+:36:  Info: macro invoked from here
+[^:]+:12: Error: operand mismatch -- `f1cvtlt z0.h,z1.h'
+[^:]+:36:  Info: macro invoked from here
+[^:]+:13: Error: operand mismatch -- `f1cvtlt z0.h,z1.s'
+[^:]+:36:  Info: macro invoked from here
+[^:]+:14: Error: operand mismatch -- `f1cvtlt z0.h,z1.d'
+[^:]+:36:  Info: macro invoked from here
+[^:]+:16: Error: expected an SVE vector register at operand 2 -- `f1cvtlt z0.h,p0,z1.d'
+[^:]+:36:  Info: macro invoked from here
+[^:]+:17: Error: expected an SVE vector register at operand 2 -- `f1cvtlt z0.h,p0/z,z1.d'
+[^:]+:36:  Info: macro invoked from here
+[^:]+:6: Error: operand mismatch -- `f2cvtlt z0.b,z1.b'
+[^:]+:37:  Info: macro invoked from here
+[^:]+:8: Error: operand mismatch -- `f2cvtlt z0.s,z1.b'
+[^:]+:37:  Info: macro invoked from here
+[^:]+:9: Error: operand mismatch -- `f2cvtlt z0.d,z1.b'
+[^:]+:37:  Info: macro invoked from here
+[^:]+:12: Error: operand mismatch -- `f2cvtlt z0.h,z1.h'
+[^:]+:37:  Info: macro invoked from here
+[^:]+:13: Error: operand mismatch -- `f2cvtlt z0.h,z1.s'
+[^:]+:37:  Info: macro invoked from here
+[^:]+:14: Error: operand mismatch -- `f2cvtlt z0.h,z1.d'
+[^:]+:37:  Info: macro invoked from here
+[^:]+:16: Error: expected an SVE vector register at operand 2 -- `f2cvtlt z0.h,p0,z1.d'
+[^:]+:37:  Info: macro invoked from here
+[^:]+:17: Error: expected an SVE vector register at operand 2 -- `f2cvtlt z0.h,p0/z,z1.d'
+[^:]+:37:  Info: macro invoked from here
+[^:]+:23: Error: operand mismatch -- `bfcvtn z1.h,{z0.h-z1.h}'
+[^:]+:39:  Info: macro invoked from here
+[^:]+:24: Error: operand mismatch -- `bfcvtn z0.s,{z0.h-z1.h}'
+[^:]+:39:  Info: macro invoked from here
+[^:]+:25: Error: operand mismatch -- `bfcvtn z7.d,{z0.h-z1.h}'
+[^:]+:39:  Info: macro invoked from here
+[^:]+:27: Error: start register out of range at operand 2 -- `bfcvtn z0.b,{z1.h-z2.h}'
+[^:]+:39:  Info: macro invoked from here
+[^:]+:23: Error: operand mismatch -- `fcvtn z1.h,{z0.h-z1.h}'
+[^:]+:40:  Info: macro invoked from here
+[^:]+:24: Error: operand mismatch -- `fcvtn z0.s,{z0.h-z1.h}'
+[^:]+:40:  Info: macro invoked from here
+[^:]+:25: Error: operand mismatch -- `fcvtn z7.d,{z0.h-z1.h}'
+[^:]+:40:  Info: macro invoked from here
+[^:]+:27: Error: start register out of range at operand 2 -- `fcvtn z0.b,{z1.h-z2.h}'
+[^:]+:40:  Info: macro invoked from here
+[^:]+:23: Error: operand mismatch -- `fcvtnb z1.h,{z0.s-z1.s}'
+[^:]+:41:  Info: macro invoked from here
+[^:]+:24: Error: operand mismatch -- `fcvtnb z0.s,{z0.s-z1.s}'
+[^:]+:41:  Info: macro invoked from here
+[^:]+:25: Error: operand mismatch -- `fcvtnb z7.d,{z0.s-z1.s}'
+[^:]+:41:  Info: macro invoked from here
+[^:]+:27: Error: start register out of range at operand 2 -- `fcvtnb z0.b,{z1.s-z2.s}'
+[^:]+:41:  Info: macro invoked from here
+[^:]+:23: Error: operand mismatch -- `fcvtnt z1.h,{z0.s-z1.s}'
+[^:]+:42:  Info: macro invoked from here
+[^:]+:24: Error: operand mismatch -- `fcvtnt z0.s,{z0.s-z1.s}'
+[^:]+:42:  Info: macro invoked from here
+[^:]+:25: Error: operand mismatch -- `fcvtnt z7.d,{z0.s-z1.s}'
+[^:]+:42:  Info: macro invoked from here
+[^:]+:27: Error: start register out of range at operand 2 -- `fcvtnt z0.b,{z1.s-z2.s}'
+[^:]+:42:  Info: macro invoked from here
diff --git a/gas/testsuite/gas/aarch64/sve2-fp8-fail.s b/gas/testsuite/gas/aarch64/sve2-fp8-fail.s
new file mode 100644
index 00000000000..057bb626247
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2-fp8-fail.s
@@ -0,0 +1,42 @@
+	/* sve2-fp8-fail.s Test file for error-checking AArch64 SVE 8-bit
+	floating-point vector instructions.  */
+
+	.macro cvt_pat1, op
+	/* Check element width qualifier for destination register.  */
+	\op	z0.b, z1.b
+	\op	z0.h, z1.b /* Valid.  */
+	\op	z0.s, z1.b
+	\op	z0.d, z1.b
+	/* Check element width qualifier for source register.  */
+	\op	z0.h, z1.b /* Valid.  */
+	\op	z0.h, z1.h
+	\op	z0.h, z1.s
+	\op	z0.h, z1.d
+	/* Ensure predicate register is not allowed.  */
+	\op	z0.h, p0, z1.d
+	\op	z0.h, p0/z, z1.d
+	.endm
+
+	.macro cvt_pat2, op, width
+	/* Check element width qualifier for destination register.  */
+	\op	z0.b, { z0.\width - z1.\width } /* Valid.  */
+	\op	z1.h, { z0.\width - z1.\width }
+	\op	z0.s, { z0.\width - z1.\width }
+	\op	z7.d, { z0.\width - z1.\width }
+	/* Check whether source register range starts at even register.  */
+	\op	z0.b, { z1.\width - z2.\width }
+	.endm
+
+	cvt_pat1 bf1cvt
+	cvt_pat1 bf2cvt
+	cvt_pat1 bf1cvtlt
+	cvt_pat1 bf2cvtlt
+	cvt_pat1 f1cvt
+	cvt_pat1 f2cvt
+	cvt_pat1 f1cvtlt
+	cvt_pat1 f2cvtlt
+
+	cvt_pat2 bfcvtn, h
+	cvt_pat2 fcvtn, h
+	cvt_pat2 fcvtnb, s
+	cvt_pat2 fcvtnt, s
diff --git a/gas/testsuite/gas/aarch64/sve2-fp8.d b/gas/testsuite/gas/aarch64/sve2-fp8.d
new file mode 100644
index 00000000000..774b8e79d09
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2-fp8.d
@@ -0,0 +1,3 @@
+#as: -march=armv8.5-a+fp8+sve2
+#objdump: -dr
+#dump: sve2-fp8-dump
diff --git a/gas/testsuite/gas/aarch64/sve2-fp8.s b/gas/testsuite/gas/aarch64/sve2-fp8.s
new file mode 100644
index 00000000000..62dee7334cc
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2-fp8.s
@@ -0,0 +1,48 @@
+	/* sve2-fp8.s Test file for AArch64 SVE 8-bit floating-point vector
+	instructions.  */
+
+	.macro cvt_pat1, op
+	\op	z0.h, z0.b
+	\op	z1.h, z0.b
+	\op	z0.h, z1.b
+	\op	z30.h, z31.b
+	.endm
+
+	.macro cvt_pat2, op, width
+	\op	z0.b, { z0.\width - z1.\width }
+	\op	z1.b, { z0.\width - z1.\width }
+	\op	z0.b, { z2.\width - z3.\width }
+	\op	z29.b, { z30.\width - z31.\width }
+	.endm
+
+	/* 8-bit floating-point convert to BFloat16 (top/bottom) with scaling by
+	2^-UInt(FPMR.LSCALE{2}[5:0]).  */
+
+	cvt_pat1 bf1cvt
+	cvt_pat1 bf2cvt
+	cvt_pat1 bf1cvtlt
+	cvt_pat1 bf2cvtlt
+
+	/* 8-bit floating-point convert to half-precision (top/bottom) with
+	scaling by 2^-UInt(FPMR.LSCALE{2}[3:0]).  */
+
+	cvt_pat1 f1cvt
+	cvt_pat1 f2cvt
+	cvt_pat1 f1cvtlt
+	cvt_pat1 f2cvtlt
+
+	/* BFloat16 convert, narrow and interleave to 8-bit floating-point
+	with scaling by 2^SInt(FPMR.NSCALE).  */
+
+	cvt_pat2 bfcvtn, h
+
+	/* Half-precision convert, narrow and interleave to 8-bit floating-point
+	with scaling by 2^SInt(FPMR.NSCALE[4:0]).  */
+
+	cvt_pat2 fcvtn, h
+
+	/* Single-precision convert, narrow and interleave to 8-bit
+	floating-point (top/bottom) with scaling by 2^SInt(FPMR.NSCALE).  */
+
+	cvt_pat2 fcvtnb, s
+	cvt_pat2 fcvtnt, s
diff --git a/opcodes/aarch64-dis-2.c b/opcodes/aarch64-dis-2.c
index 2268bf6983a..36d474403e2 100644
--- a/opcodes/aarch64-dis-2.c
+++ b/opcodes/aarch64-dis-2.c
@@ -10334,7 +10334,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                              10987654321098765432109876543210
                                              x0x11010000xxxxxxx1xxxxxxxxxxxxx
                                              addpt.  */
-                                          return 3346;
+                                          return 3358;
                                         }
                                       else
                                         {
@@ -10342,7 +10342,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                              10987654321098765432109876543210
                                              x1x11010000xxxxxxx1xxxxxxxxxxxxx
                                              subpt.  */
-                                          return 3347;
+                                          return 3359;
                                         }
                                     }
                                 }
@@ -11260,7 +11260,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                  10987654321098765432109876543210
                                  xxxx1011x11xxxxx0xxxxxxxxxxxxxxx
                                  maddpt.  */
-                              return 3348;
+                              return 3360;
                             }
                           else
                             {
@@ -11268,7 +11268,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                  10987654321098765432109876543210
                                  xxxx1011x11xxxxx1xxxxxxxxxxxxxxx
                                  msubpt.  */
-                              return 3349;
+                              return 3361;
                             }
                         }
                     }
@@ -11353,7 +11353,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                                      10987654321098765432109876543210
                                                                      000001x0xx000100000xxxxxxxxxxxxx
                                                                      addpt.  */
-                                                                  return 3350;
+                                                                  return 3362;
                                                                 }
                                                               else
                                                                 {
@@ -11460,7 +11460,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                                      10987654321098765432109876543210
                                                                      000001x0xx000101000xxxxxxxxxxxxx
                                                                      subpt.  */
-                                                                  return 3352;
+                                                                  return 3364;
                                                                 }
                                                               else
                                                                 {
@@ -11665,7 +11665,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                              10987654321098765432109876543210
                                                              000001x0xx1xxxxx000010xxxxxxxxxx
                                                              addpt.  */
-                                                          return 3351;
+                                                          return 3363;
                                                         }
                                                       else
                                                         {
@@ -11706,7 +11706,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                              10987654321098765432109876543210
                                                              000001x0xx1xxxxx000011xxxxxxxxxx
                                                              subpt.  */
-                                                          return 3353;
+                                                          return 3365;
                                                         }
                                                       else
                                                         {
@@ -13364,7 +13364,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                              10987654321098765432109876543210
                                                              010001x0xx0xxxxx110100xxxxxxxxxx
                                                              mlapt.  */
-                                                          return 3355;
+                                                          return 3367;
                                                         }
                                                     }
                                                   else
@@ -13394,7 +13394,7 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                              10987654321098765432109876543210
                                                              010001x0xx0xxxxx110110xxxxxxxxxx
                                                              madpt.  */
-                                                          return 3354;
+                                                          return 3366;
                                                         }
                                                     }
                                                 }
@@ -20817,11 +20817,55 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                             }
                                                           else
                                                             {
-                                                              /* 33222222222211111111110000000000
-                                                                 10987654321098765432109876543210
-                                                                 011001x1xx0x1000001xxxxxxxxxxxxx
-                                                                 fadda.  */
-                                                              return 1447;
+                                                              if (((word >> 20) & 0x1) == 0)
+                                                                {
+                                                                  if (((word >> 10) & 0x1) == 0)
+                                                                    {
+                                                                      if (((word >> 11) & 0x1) == 0)
+                                                                        {
+                                                                          /* 33222222222211111111110000000000
+                                                                             10987654321098765432109876543210
+                                                                             011001x1xx001000001x00xxxxxxxxxx
+                                                                             f1cvt.  */
+                                                                          return 3350;
+                                                                        }
+                                                                      else
+                                                                        {
+                                                                          /* 33222222222211111111110000000000
+                                                                             10987654321098765432109876543210
+                                                                             011001x1xx001000001x10xxxxxxxxxx
+                                                                             bf1cvt.  */
+                                                                          return 3346;
+                                                                        }
+                                                                    }
+                                                                  else
+                                                                    {
+                                                                      if (((word >> 11) & 0x1) == 0)
+                                                                        {
+                                                                          /* 33222222222211111111110000000000
+                                                                             10987654321098765432109876543210
+                                                                             011001x1xx001000001x01xxxxxxxxxx
+                                                                             f2cvt.  */
+                                                                          return 3351;
+                                                                        }
+                                                                      else
+                                                                        {
+                                                                          /* 33222222222211111111110000000000
+                                                                             10987654321098765432109876543210
+                                                                             011001x1xx001000001x11xxxxxxxxxx
+                                                                             bf2cvt.  */
+                                                                          return 3347;
+                                                                        }
+                                                                    }
+                                                                }
+                                                              else
+                                                                {
+                                                                  /* 33222222222211111111110000000000
+                                                                     10987654321098765432109876543210
+                                                                     011001x1xx011000001xxxxxxxxxxxxx
+                                                                     fadda.  */
+                                                                  return 1447;
+                                                                }
                                                             }
                                                         }
                                                       else
@@ -20837,11 +20881,55 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                     {
                                                       if (((word >> 18) & 0x1) == 0)
                                                         {
-                                                          /* 33222222222211111111110000000000
-                                                             10987654321098765432109876543210
-                                                             011001x1xx0xx010001xxxxxxxxxxxxx
-                                                             fcmeq.  */
-                                                          return 1453;
+                                                          if (((word >> 19) & 0x1) == 0)
+                                                            {
+                                                              /* 33222222222211111111110000000000
+                                                                 10987654321098765432109876543210
+                                                                 011001x1xx0x0010001xxxxxxxxxxxxx
+                                                                 fcmeq.  */
+                                                              return 1453;
+                                                            }
+                                                          else
+                                                            {
+                                                              if (((word >> 10) & 0x1) == 0)
+                                                                {
+                                                                  if (((word >> 11) & 0x1) == 0)
+                                                                    {
+                                                                      /* 33222222222211111111110000000000
+                                                                         10987654321098765432109876543210
+                                                                         011001x1xx0x1010001x00xxxxxxxxxx
+                                                                         fcvtn.  */
+                                                                      return 3355;
+                                                                    }
+                                                                  else
+                                                                    {
+                                                                      /* 33222222222211111111110000000000
+                                                                         10987654321098765432109876543210
+                                                                         011001x1xx0x1010001x10xxxxxxxxxx
+                                                                         bfcvtn.  */
+                                                                      return 3354;
+                                                                    }
+                                                                }
+                                                              else
+                                                                {
+                                                                  if (((word >> 11) & 0x1) == 0)
+                                                                    {
+                                                                      /* 33222222222211111111110000000000
+                                                                         10987654321098765432109876543210
+                                                                         011001x1xx0x1010001x01xxxxxxxxxx
+                                                                         fcvtnb.  */
+                                                                      return 3356;
+                                                                    }
+                                                                  else
+                                                                    {
+                                                                      /* 33222222222211111111110000000000
+                                                                         10987654321098765432109876543210
+                                                                         011001x1xx0x1010001x11xxxxxxxxxx
+                                                                         fcvtnt.  */
+                                                                      return 3357;
+                                                                    }
+                                                                }
+                                                            }
                                                         }
                                                       else
                                                         {
@@ -20870,21 +20958,65 @@ aarch64_opcode_lookup_1 (uint32_t word)
                                                     {
                                                       if (((word >> 18) & 0x1) == 0)
                                                         {
-                                                          if (((word >> 4) & 0x1) == 0)
+                                                          if (((word >> 19) & 0x1) == 0)
                                                             {
-                                                              /* 33222222222211111111110000000000
-                                                                 10987654321098765432109876543210
-                                                                 011001x1xx0xx001001xxxxxxxx0xxxx
-                                                                 fcmlt.  */
-                                                              return 1460;
+                                                              if (((word >> 4) & 0x1) == 0)
+                                                                {
+                                                                  /* 33222222222211111111110000000000
+                                                                     10987654321098765432109876543210
+                                                                     011001x1xx0x0001001xxxxxxxx0xxxx
+                                                                     fcmlt.  */
+                                                                  return 1460;
+                                                                }
+                                                              else
+                                                                {
+                                                                  /* 33222222222211111111110000000000
+                                                                     10987654321098765432109876543210
+                                                                     011001x1xx0x0001001xxxxxxxx1xxxx
+                                                                     fcmle.  */
+                                                                  return 1459;
+                                                                }
                                                             }
                                                           else
                                                             {
-                                                              /* 33222222222211111111110000000000
-                                                                 10987654321098765432109876543210
-                                                                 011001x1xx0xx001001xxxxxxxx1xxxx
-                                                                 fcmle.  */
-                                                              return 1459;
+                                                              if (((word >> 10) & 0x1) == 0)
+                                                                {
+                                                                  if (((word >> 11) & 0x1) == 0)
+                                                                    {
+                                                                      /* 33222222222211111111110000000000
+                                                                         10987654321098765432109876543210
+                                                                         011001x1xx0x1001001x00xxxxxxxxxx
+                                                                         f1cvtlt.  */
+                                                                      return 3352;
+                                                                    }
+                                                                  else
+                                                                    {
+                                                                      /* 33222222222211111111110000000000
+                                                                         10987654321098765432109876543210
+                                                                         011001x1xx0x1001001x10xxxxxxxxxx
+                                                                         bf1cvtlt.  */
+                                                                      return 3348;
+                                                                    }
+                                                                }
+                                                              else
+                                                                {
+                                                                  if (((word >> 11) & 0x1) == 0)
+                                                                    {
+                                                                      /* 33222222222211111111110000000000
+                                                                         10987654321098765432109876543210
+                                                                         011001x1xx0x1001001x01xxxxxxxxxx
+                                                                         f2cvtlt.  */
+                                                                      return 3353;
+                                                                    }
+                                                                  else
+                                                                    {
+                                                                      /* 33222222222211111111110000000000
+                                                                         10987654321098765432109876543210
+                                                                         011001x1xx0x1001001x11xxxxxxxxxx
+                                                                         bf2cvtlt.  */
+                                                                      return 3349;
+                                                                    }
+                                                                }
                                                             }
                                                         }
                                                       else
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index f876c1b342f..464d9313a37 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1644,6 +1644,14 @@
 {                                                       \
   QLF2(S_H,S_B),                                        \
 }
+#define OP_SVE_BH                                       \
+{                                                       \
+  QLF2(S_B,S_H),                                        \
+}
+#define OP_SVE_BS                                       \
+{                                                       \
+  QLF2(S_B,S_S),                                        \
+}
 #define OP_SVE_HHH                                      \
 {                                                       \
   QLF3(S_H,S_H,S_H),                                    \
@@ -6500,6 +6508,18 @@ const struct aarch64_opcode aarch64_opcode_table[] =
   FP8_INSN("fcvtn", 0xe40f400,  0xbfe0fc00, asimdmisc, OP3 (Vd, Vn, Vm), QL_V3_BHH, F_SIZEQ),
   FP8_INSN("fscale", 0x2ec03c00, 0xbfe0fc00, asimdmisc, OP3 (Vd, Vn, Vm), QL_VSHIFT_H, F_SIZEQ),
   FP8_INSN("fscale", 0x2ea0fc00, 0xbfa0fc00, asimdmisc, OP3 (Vd, Vn, Vm), QL_V3SAMESD, F_SIZEQ),
+  FP8_SVE2_INSN ("bf1cvt", 0x65083800, 0xfffffc00, sve_misc, 0, OP2 (SVE_Zd, SVE_Zn), OP_SVE_HB, 0, 0),
+  FP8_SVE2_INSN ("bf2cvt", 0x65083c00, 0xfffffc00, sve_misc, 0, OP2 (SVE_Zd, SVE_Zn), OP_SVE_HB, 0, 0),
+  FP8_SVE2_INSN ("bf1cvtlt", 0x65093800, 0xfffffc00, sve_misc, 0, OP2 (SVE_Zd, SVE_Zn), OP_SVE_HB, 0, 0),
+  FP8_SVE2_INSN ("bf2cvtlt", 0x65093c00, 0xfffffc00, sve_misc, 0, OP2 (SVE_Zd, SVE_Zn), OP_SVE_HB, 0, 0),
+  FP8_SVE2_INSN ("f1cvt", 0x65083000, 0xfffffc00, sve_misc, 0, OP2 (SVE_Zd, SVE_Zn), OP_SVE_HB, 0, 0),
+  FP8_SVE2_INSN ("f2cvt", 0x65083400, 0xfffffc00, sve_misc, 0, OP2 (SVE_Zd, SVE_Zn), OP_SVE_HB, 0, 0),
+  FP8_SVE2_INSN ("f1cvtlt", 0x65093000, 0xfffffc00, sve_misc, 0, OP2 (SVE_Zd, SVE_Zn), OP_SVE_HB, 0, 0),
+  FP8_SVE2_INSN ("f2cvtlt", 0x65093400, 0xfffffc00, sve_misc, 0, OP2 (SVE_Zd, SVE_Zn), OP_SVE_HB, 0, 0),
+  FP8_SVE2_INSN ("bfcvtn", 0x650a3800, 0xfffffc20, sve_misc, 0, OP2 (SVE_Zd, SME_Znx2), OP_SVE_BH, 0, 0),
+  FP8_SVE2_INSN ("fcvtn", 0x650a3000, 0xfffffc20, sve_misc, 0, OP2 (SVE_Zd, SME_Znx2), OP_SVE_BH, 0, 0),
+  FP8_SVE2_INSN ("fcvtnb", 0x650a3400, 0xfffffc20, sve_misc, 0, OP2 (SVE_Zd, SME_Znx2), OP_SVE_BS, 0, 0),
+  FP8_SVE2_INSN ("fcvtnt", 0x650a3c00, 0xfffffc20, sve_misc, 0, OP2 (SVE_Zd, SME_Znx2), OP_SVE_BS, 0, 0),
 
 /* Checked Pointer Arithmetic Instructions.  */
   CPA_INSN ("addpt",  0x9a002000, 0xffe0e000, aarch64_misc, OP3 (Rd_SP, Rn_SP, Rm_LSL), QL_I3SAMEX),
-- 
2.34.1
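
For anyone cross-checking the new aarch64-tbl.h entries against the
regenerated decode tree, here is a minimal standalone sketch. It is not
part of the patch and does not use the real opcodes API: the struct and
function names (fp8_sve2_entry, fp8_sve2_lookup, fp8_sve2_is,
fp8_sve2_self_check) are hypothetical, and it assumes the usual
convention that a word matches an entry when (word & mask) == opcode.
Only the name, opcode and mask values are copied from the table hunk
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cut-down entry: only name/opcode/mask from the patch.  */
struct fp8_sve2_entry
{
  const char *name;
  uint32_t opcode;
  uint32_t mask;
};

/* Values copied from the FP8_SVE2_INSN lines in aarch64-tbl.h.  */
static const struct fp8_sve2_entry fp8_sve2_table[] =
{
  { "bf1cvt",   0x65083800u, 0xfffffc00u },
  { "bf2cvt",   0x65083c00u, 0xfffffc00u },
  { "bf1cvtlt", 0x65093800u, 0xfffffc00u },
  { "bf2cvtlt", 0x65093c00u, 0xfffffc00u },
  { "f1cvt",    0x65083000u, 0xfffffc00u },
  { "f2cvt",    0x65083400u, 0xfffffc00u },
  { "f1cvtlt",  0x65093000u, 0xfffffc00u },
  { "f2cvtlt",  0x65093400u, 0xfffffc00u },
  { "bfcvtn",   0x650a3800u, 0xfffffc20u },
  { "fcvtn",    0x650a3000u, 0xfffffc20u },
  { "fcvtnb",   0x650a3400u, 0xfffffc20u },
  { "fcvtnt",   0x650a3c00u, 0xfffffc20u },
};

#define FP8_SVE2_COUNT (sizeof fp8_sve2_table / sizeof fp8_sve2_table[0])

/* Return the mnemonic whose opcode/mask pair matches WORD, or NULL.  */
const char *
fp8_sve2_lookup (uint32_t word)
{
  for (size_t i = 0; i < FP8_SVE2_COUNT; i++)
    if ((word & fp8_sve2_table[i].mask) == fp8_sve2_table[i].opcode)
      return fp8_sve2_table[i].name;
  return NULL;
}

/* Nonzero if WORD matches the entry called NAME.  */
int
fp8_sve2_is (uint32_t word, const char *name)
{
  const char *found = fp8_sve2_lookup (word);
  return found != NULL && strcmp (found, name) == 0;
}

/* Every base opcode must match its own entry (no overlapping entries).  */
void
fp8_sve2_self_check (void)
{
  for (size_t i = 0; i < FP8_SVE2_COUNT; i++)
    assert (fp8_sve2_is (fp8_sve2_table[i].opcode, fp8_sve2_table[i].name));
}
```

Assuming Zd sits in bits 4:0 and Zn in bits 9:5, the word 0x650a3040
(Zn = 2, Zd = 0) matches "fcvtn". The word 0x650a3060 (Zn = 3) matches
nothing, because the narrowing forms use mask 0xfffffc20, which also
requires bit 5 to be clear.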


Thread overview: 8+ messages
2024-04-10 15:29 [PATCH 0/4] aarch64: Add armv9.5-a FP8 datatype conversion Victor Do Nascimento
2024-04-10 15:29 ` [PATCH 1/4] aarch64: fp8 convert and scale - add feature flags and related structures Victor Do Nascimento
2024-04-10 15:29 ` [PATCH 2/4] aarch64: fp8 convert and scale - Add advsimd insn variants Victor Do Nascimento
2024-05-17 15:43   ` Richard Earnshaw (lists)
2024-05-20 15:36   ` Andrew Carlotti
2024-04-10 15:29 ` [PATCH 3/4] aarch64: fp8 convert and scale - add sve2 insn variants Victor Do Nascimento [this message]
2024-04-10 15:29 ` [PATCH 4/4] aarch64: fp8 convert and scale - add sme2 " Victor Do Nascimento
2024-04-17  9:50 ` [PATCH 0/4] aarch64: Add armv9.5-a FP8 datatype conversion Nick Clifton