public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions.
@ 2024-01-15  9:28 Srinath Parvathaneni
  2024-01-15  9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15  9:28 UTC (permalink / raw)
  To: binutils; +Cc: richard.earnshaw, nickc

[-- Attachment #1: Type: text/plain, Size: 388 bytes --]

Hi,

This patch add support for SVE2.1 and SME2.1 non-widening BFloat16
(FEAT_B16B16) instructions.

Following instructions predicated, unpredicated and indexed
variants are added in this patch.

bfadd, bfclamp, bfmax bfmaxnm, bfmin,bfminnm,
bfmla,bfmls,bfmul and bfsub.

Regression testing for aarch64-none-elf target and found no regressions.

Ok for binutils-master?

Regards,
Srinath.

[-- Attachment #2: 1_6.patch --]
[-- Type: text/x-patch, Size: 21502 bytes --]

diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 7eb732adbb6c85fdf4db7c4b14d0be5fafa370b6..bc40d126632e093b02268fd7474f4cf0c6ddf6d7 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -10335,6 +10335,7 @@ static const struct aarch64_option_cpu_value_table aarch64_features[] = {
   {"ite",		AARCH64_FEATURE (ITE), AARCH64_NO_FEATURES},
   {"d128",		AARCH64_FEATURE (D128),
 			AARCH64_FEATURE (LSE128)},
+  {"b16b16",		AARCH64_FEATURE (B16B16), AARCH64_FEATURE (SVE2)},
   {NULL,		AARCH64_NO_FEATURES, AARCH64_NO_FEATURES},
 };
 
diff --git a/gas/testsuite/gas/aarch64/bfloat16-1.d b/gas/testsuite/gas/aarch64/bfloat16-1.d
new file mode 100644
index 0000000000000000000000000000000000000000..f0d436bec585ff2aee2e007d63fc672a11a569b9
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/bfloat16-1.d
@@ -0,0 +1,106 @@
+#name: Test of SVE2.1 and SME2.1 non-widening BFloat16 instructions.
+#as: -march=armv9.4-a+b16b16
+#objdump: -dr
+
+[^:]+:     file format .*
+
+
+[^:]+:
+
+[^:]+:
+.*:	65008200 	bfadd	z0.h, p0\/m, z0.h, z16.h
+.*:	65008501 	bfadd	z1.h, p1\/m, z1.h, z8.h
+.*:	65008882 	bfadd	z2.h, p2\/m, z2.h, z4.h
+.*:	65009044 	bfadd	z4.h, p4\/m, z4.h, z2.h
+.*:	65009828 	bfadd	z8.h, p6\/m, z8.h, z1.h
+.*:	65009c10 	bfadd	z16.h, p7\/m, z16.h, z0.h
+.*:	65068200 	bfmax	z0.h, p0\/m, z0.h, z16.h
+.*:	65068501 	bfmax	z1.h, p1\/m, z1.h, z8.h
+.*:	65068882 	bfmax	z2.h, p2\/m, z2.h, z4.h
+.*:	65069044 	bfmax	z4.h, p4\/m, z4.h, z2.h
+.*:	65069828 	bfmax	z8.h, p6\/m, z8.h, z1.h
+.*:	65069c10 	bfmax	z16.h, p7\/m, z16.h, z0.h
+.*:	65048200 	bfmaxnm	z0.h, p0\/m, z0.h, z16.h
+.*:	65048501 	bfmaxnm	z1.h, p1\/m, z1.h, z8.h
+.*:	65048882 	bfmaxnm	z2.h, p2\/m, z2.h, z4.h
+.*:	65049044 	bfmaxnm	z4.h, p4\/m, z4.h, z2.h
+.*:	65049828 	bfmaxnm	z8.h, p6\/m, z8.h, z1.h
+.*:	65049c10 	bfmaxnm	z16.h, p7\/m, z16.h, z0.h
+.*:	65078200 	bfmin	z0.h, p0\/m, z0.h, z16.h
+.*:	65078501 	bfmin	z1.h, p1\/m, z1.h, z8.h
+.*:	65078882 	bfmin	z2.h, p2\/m, z2.h, z4.h
+.*:	65079044 	bfmin	z4.h, p4\/m, z4.h, z2.h
+.*:	65079828 	bfmin	z8.h, p6\/m, z8.h, z1.h
+.*:	65079c10 	bfmin	z16.h, p7\/m, z16.h, z0.h
+.*:	65058200 	bfminnm	z0.h, p0\/m, z0.h, z16.h
+.*:	65058501 	bfminnm	z1.h, p1\/m, z1.h, z8.h
+.*:	65058882 	bfminnm	z2.h, p2\/m, z2.h, z4.h
+.*:	65059044 	bfminnm	z4.h, p4\/m, z4.h, z2.h
+.*:	65059828 	bfminnm	z8.h, p6\/m, z8.h, z1.h
+.*:	65059c10 	bfminnm	z16.h, p7\/m, z16.h, z0.h
+.*:	65100080 	bfadd	z0.h, z4.h, z16.h
+.*:	65080101 	bfadd	z1.h, z8.h, z8.h
+.*:	65040182 	bfadd	z2.h, z12.h, z4.h
+.*:	65020204 	bfadd	z4.h, z16.h, z2.h
+.*:	65010288 	bfadd	z8.h, z20.h, z1.h
+.*:	65000310 	bfadd	z16.h, z24.h, z0.h
+.*:	64302480 	bfclamp	z0.h, z4.h, z16.h
+.*:	64282501 	bfclamp	z1.h, z8.h, z8.h
+.*:	64242582 	bfclamp	z2.h, z12.h, z4.h
+.*:	64222604 	bfclamp	z4.h, z16.h, z2.h
+.*:	64212688 	bfclamp	z8.h, z20.h, z1.h
+.*:	64202710 	bfclamp	z16.h, z24.h, z0.h
+.*:	65300000 	bfmla	z0.h, p0\/m, z0.h, z16.h
+.*:	65280421 	bfmla	z1.h, p1\/m, z1.h, z8.h
+.*:	65240842 	bfmla	z2.h, p2\/m, z2.h, z4.h
+.*:	65221084 	bfmla	z4.h, p4\/m, z4.h, z2.h
+.*:	65211908 	bfmla	z8.h, p6\/m, z8.h, z1.h
+.*:	65201e10 	bfmla	z16.h, p7\/m, z16.h, z0.h
+.*:	643e0a00 	bfmla	z0.h, z16.h, z6.h\[7\]
+.*:	643d0901 	bfmla	z1.h, z8.h, z5.h\[7\]
+.*:	643409c2 	bfmla	z2.h, z14.h, z4.h\[5\]
+.*:	642a0aa4 	bfmla	z4.h, z21.h, z2.h\[3\]
+.*:	64210988 	bfmla	z8.h, z12.h, z1.h\[1\]
+.*:	64200950 	bfmla	z16.h, z10.h, z0.h\[1\]
+.*:	65302000 	bfmls	z0.h, p0\/m, z0.h, z16.h
+.*:	65282421 	bfmls	z1.h, p1\/m, z1.h, z8.h
+.*:	65242842 	bfmls	z2.h, p2\/m, z2.h, z4.h
+.*:	65223084 	bfmls	z4.h, p4\/m, z4.h, z2.h
+.*:	65213908 	bfmls	z8.h, p6\/m, z8.h, z1.h
+.*:	65203e10 	bfmls	z16.h, p7\/m, z16.h, z0.h
+.*:	643e0e00 	bfmls	z0.h, z16.h, z6.h\[7\]
+.*:	643d0d01 	bfmls	z1.h, z8.h, z5.h\[7\]
+.*:	64340dc2 	bfmls	z2.h, z14.h, z4.h\[5\]
+.*:	642a0ea4 	bfmls	z4.h, z21.h, z2.h\[3\]
+.*:	64210d88 	bfmls	z8.h, z12.h, z1.h\[1\]
+.*:	64200d50 	bfmls	z16.h, z10.h, z0.h\[1\]
+.*:	65028200 	bfmul	z0.h, p0\/m, z0.h, z16.h
+.*:	65028501 	bfmul	z1.h, p1\/m, z1.h, z8.h
+.*:	65028882 	bfmul	z2.h, p2\/m, z2.h, z4.h
+.*:	65029044 	bfmul	z4.h, p4\/m, z4.h, z2.h
+.*:	65029828 	bfmul	z8.h, p6\/m, z8.h, z1.h
+.*:	65029c10 	bfmul	z16.h, p7\/m, z16.h, z0.h
+.*:	65100880 	bfmul	z0.h, z4.h, z16.h
+.*:	65080901 	bfmul	z1.h, z8.h, z8.h
+.*:	65040982 	bfmul	z2.h, z12.h, z4.h
+.*:	65020a04 	bfmul	z4.h, z16.h, z2.h
+.*:	65010a88 	bfmul	z8.h, z20.h, z1.h
+.*:	65000b10 	bfmul	z16.h, z24.h, z0.h
+.*:	643e2a00 	bfmul	z0.h, z16.h, z6.h\[7\]
+.*:	643d2901 	bfmul	z1.h, z8.h, z5.h\[7\]
+.*:	643429c2 	bfmul	z2.h, z14.h, z4.h\[5\]
+.*:	642a2aa4 	bfmul	z4.h, z21.h, z2.h\[3\]
+.*:	64212988 	bfmul	z8.h, z12.h, z1.h\[1\]
+.*:	64202950 	bfmul	z16.h, z10.h, z0.h\[1\]
+.*:	65018200 	bfsub	z0.h, p0\/m, z0.h, z16.h
+.*:	65018501 	bfsub	z1.h, p1\/m, z1.h, z8.h
+.*:	65018882 	bfsub	z2.h, p2\/m, z2.h, z4.h
+.*:	65019044 	bfsub	z4.h, p4\/m, z4.h, z2.h
+.*:	65019828 	bfsub	z8.h, p6\/m, z8.h, z1.h
+.*:	65019c10 	bfsub	z16.h, p7\/m, z16.h, z0.h
+.*:	65100480 	bfsub	z0.h, z4.h, z16.h
+.*:	65080501 	bfsub	z1.h, z8.h, z8.h
+.*:	65040582 	bfsub	z2.h, z12.h, z4.h
+.*:	65020604 	bfsub	z4.h, z16.h, z2.h
+.*:	65010688 	bfsub	z8.h, z20.h, z1.h
+.*:	65000710 	bfsub	z16.h, z24.h, z0.h
diff --git a/gas/testsuite/gas/aarch64/bfloat16-1.s b/gas/testsuite/gas/aarch64/bfloat16-1.s
new file mode 100644
index 0000000000000000000000000000000000000000..5597d9ef01906f7316149cdf0bb69addeb849926
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/bfloat16-1.s
@@ -0,0 +1,112 @@
+bfadd z0.h, p0/m, z0.h, z16.h
+bfadd z1.h, p1/m, z1.h, z8.h
+bfadd z2.h, p2/m, z2.h, z4.h
+bfadd z4.h, p4/m, z4.h, z2.h
+bfadd z8.h, p6/m, z8.h, z1.h
+bfadd z16.h, p7/m, z16.h, z0.h
+
+bfmax z0.h, p0/m, z0.h, z16.h
+bfmax z1.h, p1/m, z1.h, z8.h
+bfmax z2.h, p2/m, z2.h, z4.h
+bfmax z4.h, p4/m, z4.h, z2.h
+bfmax z8.h, p6/m, z8.h, z1.h
+bfmax z16.h, p7/m, z16.h, z0.h
+
+bfmaxnm z0.h, p0/m, z0.h, z16.h
+bfmaxnm z1.h, p1/m, z1.h, z8.h
+bfmaxnm z2.h, p2/m, z2.h, z4.h
+bfmaxnm z4.h, p4/m, z4.h, z2.h
+bfmaxnm z8.h, p6/m, z8.h, z1.h
+bfmaxnm z16.h, p7/m, z16.h, z0.h
+
+bfmin z0.h, p0/m, z0.h, z16.h
+bfmin z1.h, p1/m, z1.h, z8.h
+bfmin z2.h, p2/m, z2.h, z4.h
+bfmin z4.h, p4/m, z4.h, z2.h
+bfmin z8.h, p6/m, z8.h, z1.h
+bfmin z16.h, p7/m, z16.h, z0.h
+
+bfminnm z0.h, p0/m, z0.h, z16.h
+bfminnm z1.h, p1/m, z1.h, z8.h
+bfminnm z2.h, p2/m, z2.h, z4.h
+bfminnm z4.h, p4/m, z4.h, z2.h
+bfminnm z8.h, p6/m, z8.h, z1.h
+bfminnm z16.h, p7/m, z16.h, z0.h
+
+bfadd z0.h, z4.h, z16.h
+bfadd z1.h, z8.h, z8.h
+bfadd z2.h, z12.h, z4.h
+bfadd z4.h, z16.h, z2.h
+bfadd z8.h, z20.h, z1.h
+bfadd z16.h, z24.h, z0.h
+
+bfclamp z0.h, z4.h, z16.h
+bfclamp z1.h, z8.h, z8.h
+bfclamp z2.h, z12.h, z4.h
+bfclamp z4.h, z16.h, z2.h
+bfclamp z8.h, z20.h, z1.h
+bfclamp z16.h, z24.h, z0.h
+bfmla z0.h, p0/m, z0.h, z16.h
+bfmla z1.h, p1/m, z1.h, z8.h
+bfmla z2.h, p2/m, z2.h, z4.h
+bfmla z4.h, p4/m, z4.h, z2.h
+bfmla z8.h, p6/m, z8.h, z1.h
+bfmla z16.h, p7/m, z16.h, z0.h
+
+bfmla z0.h, z16.h, z6.h[7]
+bfmla z1.h, z8.h, z5.h[6]
+bfmla z2.h, z14.h, z4.h[4]
+bfmla z4.h, z21.h, z2.h[2]
+bfmla z8.h, z12.h, z1.h[1]
+bfmla z16.h, z10.h, z0.h[0]
+
+bfmls z0.h, p0/m, z0.h, z16.h
+bfmls z1.h, p1/m, z1.h, z8.h
+bfmls z2.h, p2/m, z2.h, z4.h
+bfmls z4.h, p4/m, z4.h, z2.h
+bfmls z8.h, p6/m, z8.h, z1.h
+bfmls z16.h, p7/m, z16.h, z0.h
+
+bfmls z0.h, z16.h, z6.h[7]
+bfmls z1.h, z8.h, z5.h[6]
+bfmls z2.h, z14.h, z4.h[4]
+bfmls z4.h, z21.h, z2.h[2]
+bfmls z8.h, z12.h, z1.h[1]
+bfmls z16.h, z10.h, z0.h[0]
+
+bfmul z0.h, p0/m, z0.h, z16.h
+bfmul z1.h, p1/m, z1.h, z8.h
+bfmul z2.h, p2/m, z2.h, z4.h
+bfmul z4.h, p4/m, z4.h, z2.h
+bfmul z8.h, p6/m, z8.h, z1.h
+bfmul z16.h, p7/m, z16.h, z0.h
+
+bfmul z0.h, z4.h, z16.h
+bfmul z1.h, z8.h, z8.h
+bfmul z2.h, z12.h, z4.h
+bfmul z4.h, z16.h, z2.h
+bfmul z8.h, z20.h, z1.h
+bfmul z16.h, z24.h, z0.h
+
+bfmul z0.h, z16.h, z6.h[7]
+bfmul z1.h, z8.h, z5.h[6]
+bfmul z2.h, z14.h, z4.h[4]
+bfmul z4.h, z21.h, z2.h[2]
+bfmul z8.h, z12.h, z1.h[1]
+bfmul z16.h, z10.h, z0.h[0]
+
+bfsub z0.h, p0/m, z0.h, z16.h
+bfsub z1.h, p1/m, z1.h, z8.h
+bfsub z2.h, p2/m, z2.h, z4.h
+bfsub z4.h, p4/m, z4.h, z2.h
+bfsub z8.h, p6/m, z8.h, z1.h
+bfsub z16.h, p7/m, z16.h, z0.h
+
+bfsub z0.h, z4.h, z16.h
+bfsub z1.h, z8.h, z8.h
+bfsub z2.h, z12.h, z4.h
+bfsub z4.h, z16.h, z2.h
+bfsub z8.h, z20.h, z1.h
+bfsub z16.h, z24.h, z0.h
+
+
diff --git a/gas/testsuite/gas/aarch64/bfloat16-bad.d b/gas/testsuite/gas/aarch64/bfloat16-bad.d
new file mode 100644
index 0000000000000000000000000000000000000000..10d2b001c1a39851ab020e20997f2774663dc3ba
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/bfloat16-bad.d
@@ -0,0 +1,4 @@
+#name: Negative test of Bfloat16 instructions.
+#as: -march=armv9.4-a
+#source: bfloat16-1.s
+#error_output: bfloat16-bad.l
diff --git a/gas/testsuite/gas/aarch64/bfloat16-bad.l b/gas/testsuite/gas/aarch64/bfloat16-bad.l
new file mode 100644
index 0000000000000000000000000000000000000000..5a5192b329cd250914c860de5331ef3952ef846b
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/bfloat16-bad.l
@@ -0,0 +1,97 @@
+.*: Assembler messages:
+.*: Error: selected processor does not support `bfadd z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfadd z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfadd z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfadd z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfadd z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfadd z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmax z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmax z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmax z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmax z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmax z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmax z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmaxnm z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmaxnm z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmaxnm z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmaxnm z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmaxnm z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmaxnm z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmin z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmin z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmin z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmin z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmin z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmin z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfminnm z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfminnm z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfminnm z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfminnm z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfminnm z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfminnm z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfadd z0.h,z4.h,z16.h'
+.*: Error: selected processor does not support `bfadd z1.h,z8.h,z8.h'
+.*: Error: selected processor does not support `bfadd z2.h,z12.h,z4.h'
+.*: Error: selected processor does not support `bfadd z4.h,z16.h,z2.h'
+.*: Error: selected processor does not support `bfadd z8.h,z20.h,z1.h'
+.*: Error: selected processor does not support `bfadd z16.h,z24.h,z0.h'
+.*: Error: selected processor does not support `bfclamp z0.h,z4.h,z16.h'
+.*: Error: selected processor does not support `bfclamp z1.h,z8.h,z8.h'
+.*: Error: selected processor does not support `bfclamp z2.h,z12.h,z4.h'
+.*: Error: selected processor does not support `bfclamp z4.h,z16.h,z2.h'
+.*: Error: selected processor does not support `bfclamp z8.h,z20.h,z1.h'
+.*: Error: selected processor does not support `bfclamp z16.h,z24.h,z0.h'
+.*: Error: selected processor does not support `bfmla z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmla z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmla z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmla z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmla z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmla z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmla z0.h,z16.h,z6.h\[7\]'
+.*: Error: selected processor does not support `bfmla z1.h,z8.h,z5.h\[6\]'
+.*: Error: selected processor does not support `bfmla z2.h,z14.h,z4.h\[4\]'
+.*: Error: selected processor does not support `bfmla z4.h,z21.h,z2.h\[2\]'
+.*: Error: selected processor does not support `bfmla z8.h,z12.h,z1.h\[1\]'
+.*: Error: selected processor does not support `bfmla z16.h,z10.h,z0.h\[0\]'
+.*: Error: selected processor does not support `bfmls z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmls z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmls z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmls z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmls z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmls z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmls z0.h,z16.h,z6.h\[7\]'
+.*: Error: selected processor does not support `bfmls z1.h,z8.h,z5.h\[6\]'
+.*: Error: selected processor does not support `bfmls z2.h,z14.h,z4.h\[4\]'
+.*: Error: selected processor does not support `bfmls z4.h,z21.h,z2.h\[2\]'
+.*: Error: selected processor does not support `bfmls z8.h,z12.h,z1.h\[1\]'
+.*: Error: selected processor does not support `bfmls z16.h,z10.h,z0.h\[0\]'
+.*: Error: selected processor does not support `bfmul z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmul z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmul z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmul z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmul z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmul z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmul z0.h,z4.h,z16.h'
+.*: Error: selected processor does not support `bfmul z1.h,z8.h,z8.h'
+.*: Error: selected processor does not support `bfmul z2.h,z12.h,z4.h'
+.*: Error: selected processor does not support `bfmul z4.h,z16.h,z2.h'
+.*: Error: selected processor does not support `bfmul z8.h,z20.h,z1.h'
+.*: Error: selected processor does not support `bfmul z16.h,z24.h,z0.h'
+.*: Error: selected processor does not support `bfmul z0.h,z16.h,z6.h\[7\]'
+.*: Error: selected processor does not support `bfmul z1.h,z8.h,z5.h\[6\]'
+.*: Error: selected processor does not support `bfmul z2.h,z14.h,z4.h\[4\]'
+.*: Error: selected processor does not support `bfmul z4.h,z21.h,z2.h\[2\]'
+.*: Error: selected processor does not support `bfmul z8.h,z12.h,z1.h\[1\]'
+.*: Error: selected processor does not support `bfmul z16.h,z10.h,z0.h\[0\]'
+.*: Error: selected processor does not support `bfsub z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfsub z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfsub z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfsub z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfsub z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfsub z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfsub z0.h,z4.h,z16.h'
+.*: Error: selected processor does not support `bfsub z1.h,z8.h,z8.h'
+.*: Error: selected processor does not support `bfsub z2.h,z12.h,z4.h'
+.*: Error: selected processor does not support `bfsub z4.h,z16.h,z2.h'
+.*: Error: selected processor does not support `bfsub z8.h,z20.h,z1.h'
+.*: Error: selected processor does not support `bfsub z16.h,z24.h,z0.h'
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index 9d64d7a0ebefa4014f30a46c5be7bda124666327..e2ca92361b46a27f67d315d155eb3a9608176cb7 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -222,6 +222,8 @@ enum aarch64_feature_bit {
   AARCH64_FEATURE_PMUv3_ICNTR,
   /* Performance Monitors Synchronous-Exception-Based Event Extension.  */
   AARCH64_FEATURE_SEBEP,
+  /* SVE2.1 and SME2.1 non-widening BFloat16 instructions.  */
+  AARCH64_FEATURE_B16B16,
   AARCH64_NUM_FEATURES
 };
 
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 0cf195d03216a38e1a9b5e06b80af064e2440b91..a8ccdafd044efd62d11ba1e4c199792f6dd44559 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1761,6 +1761,10 @@
 {                                                       \
   QLF3(S_S,NIL,S_S),                                    \
 }
+#define OP_SVE_SMSS				     \
+{						       \
+  QLF4(S_H,P_M,S_H,S_H),				\
+}
 #define OP_SVE_SUU                                      \
 {                                                       \
   QLF3(S_S,NIL,NIL),                                    \
@@ -2608,6 +2612,8 @@ static const aarch64_feature_set aarch64_feature_the =
   AARCH64_FEATURE (THE);
 static const aarch64_feature_set aarch64_feature_d128_the =
   AARCH64_FEATURES (2, D128, THE);
+static const aarch64_feature_set aarch64_feature_b16b16 =
+  AARCH64_FEATURE (B16B16);
 
 #define CORE		&aarch64_feature_v8
 #define FP		&aarch64_feature_fp
@@ -2670,6 +2676,7 @@ static const aarch64_feature_set aarch64_feature_d128_the =
 #define D128	  &aarch64_feature_d128
 #define THE	  &aarch64_feature_the
 #define D128_THE  &aarch64_feature_d128_the
+#define B16B16  &aarch64_feature_b16b16
 
 #define CORE_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS) \
   { NAME, OPCODE, MASK, CLASS, OP, CORE, OPS, QUALS, FLAGS, 0, 0, NULL }
@@ -2739,6 +2746,12 @@ static const aarch64_feature_set aarch64_feature_d128_the =
 #define SVE2_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
   { NAME, OPCODE, MASK, CLASS, OP, SVE2, OPS, QUALS, \
     FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
+#define B16B16_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
+  { NAME, OPCODE, MASK, CLASS, OP, B16B16, OPS, QUALS, \
+    FLAGS | F_STRICT, 0, TIED, NULL }
+#define B16B16_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
+  { NAME, OPCODE, MASK, CLASS, OP, B16B16, OPS, QUALS, \
+    FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
 #define SVE2AES_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
   { NAME, OPCODE, MASK, CLASS, OP, SVE2_AES, OPS, QUALS, \
     FLAGS | F_STRICT, 0, TIED, NULL }
@@ -6258,6 +6271,24 @@ const struct aarch64_opcode aarch64_opcode_table[] =
   D128_THE_INSN("rcwsswppal", 0x59e0a000, 0xffe0fc00, OP3 (Rt, Rs, ADDR_SIMPLE), QL_X2NIL, 0),
   D128_THE_INSN("rcwsswppl", 0x5960a000, 0xffe0fc00, OP3 (Rt, Rs, ADDR_SIMPLE), QL_X2NIL, 0),
 
+/* BFloat16 SVE Instructions.  */
+  B16B16_INSNC("bfadd", 0x65008000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSNC("bfmax", 0x65068000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSNC("bfmaxnm", 0x65048000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSNC("bfmin", 0x65078000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSNC("bfminnm", 0x65058000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSNC("bfmla", 0x65200000, 0xffe0e000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zn, SVE_Zm_16), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSNC("bfmls", 0x65202000, 0xffe0e000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zn, SVE_Zm_16), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSN("bfadd", 0x65000000, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_HHH, 0, 0),
+  B16B16_INSN("bfclamp", 0x64202400, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_HHH, 0, 0),
+  B16B16_INSNC("bfmul", 0x65028000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSN("bfmul", 0x65000800, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_HHH, 0, 0),
+  B16B16_INSNC("bfsub", 0x65018000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+  B16B16_INSN("bfsub", 0x65000400, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_HHH, 0, 0),
+  B16B16_INSN("bfmla", 0x64200800, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
+  B16B16_INSN("bfmls", 0x64200c00, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
+  B16B16_INSN("bfmul", 0x64202800, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
+
   {0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
 };
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions.
  2024-01-15  9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
@ 2024-01-15  9:34 ` Srinath Parvathaneni
  2024-01-15  9:35   ` [PATCH 3/6][Binutils] aarch64: Add support for FEAT_SVE2p1 Srinath Parvathaneni
  2024-01-15  9:37 ` [PATCH 4/6][Binutils] aarch64: Add SVE2.1 dupq, eorqv and extq instructions Srinath Parvathaneni
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15  9:34 UTC (permalink / raw)
  To: binutils; +Cc: richard.earnshaw, nickc

[-- Attachment #1: Type: text/plain, Size: 374 bytes --]

Hi,

This patch add support for FEAT_SME2p1 and "movaz" instructions
along with the optional flag +sme2p1.

Following "movaz" instructions are add:
Move and zero two ZA tile slices to vector registers.
Move and zero four ZA tile slices to vector registers.

Regression testing for aarch64-none-elf target and found no regressions.

Ok for binutils-master?

Regards,
Srinath.

[-- Attachment #2: 2_6.patch --]
[-- Type: text/x-patch, Size: 24758 bytes --]

diff --git a/gas/NEWS b/gas/NEWS
index 74df7e61349626926bec1626aa5d89629c5d6c4a..d2c5c0641c4392d66472e535e5f51756fc5d511f 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,8 @@
 -*- text -*-
 
+* Add support for the AArch64 Scalable Matrix Extension version 2.1 (SME2.1)
+  instructions.
+
 * Add support for 'armv8.9-a' and 'armv9.4-a' for -march in Arm GAS.
 
 * Initial support for Intel APX: 32 GPRs, NDD, PUSH2/POP2 and PUSHP/POPP.
diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index bc40d126632e093b02268fd7474f4cf0c6ddf6d7..34159c2168b78fe12d4a549678ae77be88e50313 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -4492,6 +4492,7 @@ parse_sme_immediate (char **str, int64_t *imm)
 
    [<Wv>, <imm>]
    [<Wv>, #<imm>]
+   [<Ws>, <offsf>:<offsl>]
 
    Return true on success, populating OPND with the parsed index.  */
 
@@ -4592,6 +4593,7 @@ parse_sme_za_index (char **str, struct aarch64_indexed_za *opnd)
    <Pm>.<T>[<Wv>< #<imm>]
    ZA[<Wv>, #<imm>]
    <ZAn><HV>.<T>[<Wv>, #<imm>]
+   <ZAn><HV>.<T>[<Ws>, <offsf>:<offsl>]
 
    FLAGS is as for parse_typed_reg.  */
 
@@ -7865,6 +7867,21 @@ parse_operands (char *str, const aarch64_opcode *opcode)
 	  info->qualifier = qualifier;
 	  break;
 
+	case AARCH64_OPND_SME_ZA_array_vrsb_1:
+	case AARCH64_OPND_SME_ZA_array_vrsh_1:
+	case AARCH64_OPND_SME_ZA_array_vrss_1:
+	case AARCH64_OPND_SME_ZA_array_vrsd_1:
+	case AARCH64_OPND_SME_ZA_array_vrsb_2:
+	case AARCH64_OPND_SME_ZA_array_vrsh_2:
+	case AARCH64_OPND_SME_ZA_array_vrss_2:
+	case AARCH64_OPND_SME_ZA_array_vrsd_2:
+	  if (!parse_dual_indexed_reg (&str, REG_TYPE_ZATHV,
+				       &info->indexed_za, &qualifier, 0))
+	    goto failure;
+	  info->qualifier = qualifier;
+	  break;
+
+
 	case AARCH64_OPND_SME_VLxN_10:
 	case AARCH64_OPND_SME_VLxN_13:
 	  po_strict_enum_or_fail (aarch64_sme_vlxn_array);
@@ -10336,6 +10353,7 @@ static const struct aarch64_option_cpu_value_table aarch64_features[] = {
   {"d128",		AARCH64_FEATURE (D128),
 			AARCH64_FEATURE (LSE128)},
   {"b16b16",		AARCH64_FEATURE (B16B16), AARCH64_FEATURE (SVE2)},
+  {"sme2p1",		AARCH64_FEATURE (SME2p1), AARCH64_FEATURE (SME2)},
   {NULL,		AARCH64_NO_FEATURES, AARCH64_NO_FEATURES},
 };
 
diff --git a/gas/doc/c-aarch64.texi b/gas/doc/c-aarch64.texi
index ccf18ee2661fed0bb89608d7f122556cb541e5b5..1f3a4fbcafbab652d70f202b33527ae3dde7e1bb 100644
--- a/gas/doc/c-aarch64.texi
+++ b/gas/doc/c-aarch64.texi
@@ -276,6 +276,8 @@ automatically cause those extensions to be disabled.
  @tab Enable TRCIT instruction.
 @item @code{d128} @tab Armv9.4-A @tab No
  @tab Enable the 128-bit Page Descriptor Extension.  This implies @code{lse128}.
+@item @code{sme2p1} @tab N/A @tab No
+ @tab Enable the SME2.1 Extension.
 @end multitable
 
 @node AArch64 Syntax
diff --git a/gas/testsuite/gas/aarch64/sme2p1-1.d b/gas/testsuite/gas/aarch64/sme2p1-1.d
new file mode 100644
index 0000000000000000000000000000000000000000..a6e7b7664024e7f03ddd1d8ece9d6c3bd1c79042
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sme2p1-1.d
@@ -0,0 +1,42 @@
+#name: Test of SME2.1 movaz instructions.
+#as: -march=armv9.4-a+sme2p1
+#objdump: -dr
+
+[^:]+:     file format .*
+
+
+[^:]+:
+
+[^:]+:
+.*:	c006c260 	movaz	{z0.b-z1.b}, za0v.b \[w14, 6:7\]
+.*:	c046c260 	movaz	{z0.h-z1.h}, za0v.h \[w14, 6:7\]
+.*:	c086c220 	movaz	{z0.s-z1.s}, za0v.s \[w14, 2:3\]
+.*:	c0c6c200 	movaz	{z0.d-z1.d}, za0v.d \[w14, 0:1\]
+.*:	c00602e0 	movaz	{z0.b-z1.b}, za0h.b \[w12, 14:15\]
+.*:	c0462260 	movaz	{z0.h-z1.h}, za0h.h \[w13, 6:7\]
+.*:	c0864220 	movaz	{z0.s-z1.s}, za0h.s \[w14, 2:3\]
+.*:	c0c66200 	movaz	{z0.d-z1.d}, za0h.d \[w15, 0:1\]
+.*:	c006c260 	movaz	{z0.b-z1.b}, za0v.b \[w14, 6:7\]
+.*:	c046c2e0 	movaz	{z0.h-z1.h}, za1v.h \[w14, 6:7\]
+.*:	c086c2a0 	movaz	{z0.s-z1.s}, za2v.s \[w14, 2:3\]
+.*:	c0c6c260 	movaz	{z0.d-z1.d}, za3v.d \[w14, 0:1\]
+.*:	c00602e0 	movaz	{z0.b-z1.b}, za0h.b \[w12, 14:15\]
+.*:	c04622e0 	movaz	{z0.h-z1.h}, za1h.h \[w13, 6:7\]
+.*:	c08642a0 	movaz	{z0.s-z1.s}, za2h.s \[w14, 2:3\]
+.*:	c0c66260 	movaz	{z0.d-z1.d}, za3h.d \[w15, 0:1\]
+.*:	c006c660 	movaz	{z0.b-z3.b}, za0v.b \[w14, 12:15\]
+.*:	c046c620 	movaz	{z0.h-z3.h}, za0v.h \[w14, 4:7\]
+.*:	c086c600 	movaz	{z0.s-z3.s}, za0v.s \[w14, 0:3\]
+.*:	c0c6c600 	movaz	{z0.d-z3.d}, za0v.d \[w14, 0:3\]
+.*:	c0060660 	movaz	{z0.b-z3.b}, za0h.b \[w12, 12:15\]
+.*:	c0462620 	movaz	{z0.h-z3.h}, za0h.h \[w13, 4:7\]
+.*:	c0864600 	movaz	{z0.s-z3.s}, za0h.s \[w14, 0:3\]
+.*:	c0c66600 	movaz	{z0.d-z3.d}, za0h.d \[w15, 0:3\]
+.*:	c006c640 	movaz	{z0.b-z3.b}, za0v.b \[w14, 8:11\]
+.*:	c046c660 	movaz	{z0.h-z3.h}, za1v.h \[w14, 4:7\]
+.*:	c086c640 	movaz	{z0.s-z3.s}, za2v.s \[w14, 0:3\]
+.*:	c0c6c660 	movaz	{z0.d-z3.d}, za3v.d \[w14, 0:3\]
+.*:	c0060660 	movaz	{z0.b-z3.b}, za0h.b \[w12, 12:15\]
+.*:	c0462660 	movaz	{z0.h-z3.h}, za1h.h \[w13, 4:7\]
+.*:	c0864640 	movaz	{z0.s-z3.s}, za2h.s \[w14, 0:3\]
+.*:	c0c66660 	movaz	{z0.d-z3.d}, za3h.d \[w15, 0:3\]
diff --git a/gas/testsuite/gas/aarch64/sme2p1-1.s b/gas/testsuite/gas/aarch64/sme2p1-1.s
new file mode 100644
index 0000000000000000000000000000000000000000..77481d4b874b4688e10c794e6ea9e1ff0c81ef3d
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sme2p1-1.s
@@ -0,0 +1,39 @@
+	movaz {z0.b - z1.b}, ZA0V.B [w14, 6:7]
+	movaz {z0.h - z1.h}, ZA0V.H [w14, 6:7]
+	movaz {z0.s - z1.s}, ZA0V.S [w14, 2:3]
+	movaz {z0.d - z1.d}, ZA0V.D [w14, 0:1]
+
+	movaz {z0.b - z1.b}, ZA0H.B [w12, 14:15]
+	movaz {z0.h - z1.h}, ZA0H.H [w13, 6:7]
+	movaz {z0.s - z1.s}, ZA0H.S [w14, 2:3]
+	movaz {z0.d - z1.d}, ZA0H.D [w15, 0:1]
+
+	movaz {z0.b - z1.b}, ZA0V.B [w14, 6:7]
+	movaz {z0.h - z1.h}, ZA1V.H [w14, 6:7]
+	movaz {z0.s - z1.s}, ZA2V.S [w14, 2:3]
+	movaz {z0.d - z1.d}, ZA3V.D [w14, 0:1]
+
+	movaz {z0.b - z1.b}, ZA0H.B [w12, 14:15]
+	movaz {z0.h - z1.h}, ZA1H.H [w13, 6:7]
+	movaz {z0.s - z1.s}, ZA2H.S [w14, 2:3]
+	movaz {z0.d - z1.d}, ZA3H.D [w15, 0:1]
+
+	movaz {z0.b - z3.b}, ZA0V.B [w14, 12:15]
+	movaz {z0.h - z3.h}, ZA0V.H [w14, 4:7]
+	movaz {z0.s - z3.s}, ZA0V.S [w14, 0:3]
+	movaz {z0.d - z3.d}, ZA0V.D [w14, 0:3]
+
+	movaz {z0.b - z3.b}, ZA0H.B [w12, 12:15]
+	movaz {z0.h - z3.h}, ZA0H.H [w13, 4:7]
+	movaz {z0.s - z3.s}, ZA0H.S [w14, 0:3]
+	movaz {z0.d - z3.d}, ZA0H.D [w15, 0:3]
+
+	movaz {z0.b - z3.b}, ZA0V.B [w14, 8:11]
+	movaz {z0.h - z3.h}, ZA1V.H [w14, 4:7]
+	movaz {z0.s - z3.s}, ZA2V.S [w14, 0:3]
+	movaz {z0.d - z3.d}, ZA3V.D [w14, 0:3]
+
+	movaz {z0.b - z3.b}, ZA0H.B [w12, 12:15]
+	movaz {z0.h - z3.h}, ZA1H.H [w13, 4:7]
+	movaz {z0.s - z3.s}, ZA2H.S [w14, 0:3]
+	movaz {z0.d - z3.d}, ZA3H.D [w15, 0:3]
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index e2ca92361b46a27f67d315d155eb3a9608176cb7..648e25f3e4242bb738eee5f62079838784223b8a 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -224,6 +224,8 @@ enum aarch64_feature_bit {
   AARCH64_FEATURE_SEBEP,
   /* SVE2.1 and SME2.1 non-widening BFloat16 instructions.  */
   AARCH64_FEATURE_B16B16,
+  /* SME2.1 instructions.  */
+  AARCH64_FEATURE_SME2p1,
   AARCH64_NUM_FEATURES
 };
 
@@ -705,6 +707,14 @@ enum aarch64_opnd
   AARCH64_OPND_SVE_Vd,		/* Scalar SIMD&FP register in Vd.  */
   AARCH64_OPND_SVE_Vm,		/* Scalar SIMD&FP register in Vm.  */
   AARCH64_OPND_SVE_Vn,		/* Scalar SIMD&FP register in Vn.  */
+  AARCH64_OPND_SME_ZA_array_vrsb_1, /* Tile to vector, two registers (B).  */
+  AARCH64_OPND_SME_ZA_array_vrsh_1, /* Tile to vector, two registers (H).  */
+  AARCH64_OPND_SME_ZA_array_vrss_1, /* Tile to vector, two registers (S).  */
+  AARCH64_OPND_SME_ZA_array_vrsd_1, /* Tile to vector, two registers (D).  */
+  AARCH64_OPND_SME_ZA_array_vrsb_2, /* Tile to vector, four registers (B).  */
+  AARCH64_OPND_SME_ZA_array_vrsh_2, /* Tile to vector, four registers (H).  */
+  AARCH64_OPND_SME_ZA_array_vrss_2, /* Tile to vector, four registers (S). */
+  AARCH64_OPND_SME_ZA_array_vrsd_2, /* Tile to vector, four registers (D).  */
   AARCH64_OPND_SVE_Za_5,	/* SVE vector register in Za, bits [9,5].  */
   AARCH64_OPND_SVE_Za_16,	/* SVE vector register in Za, bits [20,16].  */
   AARCH64_OPND_SVE_Zd,		/* SVE vector register in Zd.  */
@@ -962,6 +972,7 @@ enum aarch64_insn_class
   sme_start,
   sme_stop,
   sme2_mov,
+  sme2_movaz,
   sve_cpy,
   sve_index,
   sve_limm,
diff --git a/opcodes/aarch64-asm.h b/opcodes/aarch64-asm.h
index a3bf7bda0130f06623b0b56c18d9eca632b24a3a..d4b6407dc5de8d6e103ee8ca5b5f2c6bb814647f 100644
--- a/opcodes/aarch64-asm.h
+++ b/opcodes/aarch64-asm.h
@@ -100,6 +100,8 @@ AARCH64_DECL_OPD_INSERTER (ins_sve_strided_reglist);
 AARCH64_DECL_OPD_INSERTER (ins_sve_scale);
 AARCH64_DECL_OPD_INSERTER (ins_sve_shlimm);
 AARCH64_DECL_OPD_INSERTER (ins_sve_shrimm);
+AARCH64_DECL_OPD_INSERTER (ins_sme_za_vrs1);
+AARCH64_DECL_OPD_INSERTER (ins_sme_za_vrs2);
 AARCH64_DECL_OPD_INSERTER (ins_sme_za_hv_tiles);
 AARCH64_DECL_OPD_INSERTER (ins_sme_za_hv_tiles_range);
 AARCH64_DECL_OPD_INSERTER (ins_sme_za_list);
diff --git a/opcodes/aarch64-asm.c b/opcodes/aarch64-asm.c
index 1db290eea7e9d23893bfd2bd0a07c13392f6c9f6..3fac127a5899077e2ac19c5e98df737b8ffbe147 100644
--- a/opcodes/aarch64-asm.c
+++ b/opcodes/aarch64-asm.c
@@ -1375,6 +1375,76 @@ aarch64_ins_sve_float_zero_one (const aarch64_operand *self,
   return true;
 }
 
+bool
+aarch64_ins_sme_za_vrs1 (const aarch64_operand *self,
+			     const aarch64_opnd_info *info,
+			     aarch64_insn *code,
+			     const aarch64_inst *inst ATTRIBUTE_UNUSED,
+			     aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+  int za_reg = info->indexed_za.regno;
+  int regno = info->indexed_za.index.regno & 3;
+  int imm = info->indexed_za.index.imm;
+  int v =  info->indexed_za.v;
+  int countm1 = info->indexed_za.index.countm1;
+
+  insert_field (self->fields[0], code, v, 0);
+  insert_field (self->fields[1], code, regno, 0);
+  switch (info->qualifier)
+    {
+    case AARCH64_OPND_QLF_S_B:
+      insert_field (self->fields[2], code, imm / (countm1 + 1), 0);
+      break;
+    case AARCH64_OPND_QLF_S_H:
+    case AARCH64_OPND_QLF_S_S:
+      insert_field (self->fields[2], code, za_reg, 0);
+      insert_field (self->fields[3], code, imm / (countm1 + 1), 0);
+      break;
+    case AARCH64_OPND_QLF_S_D:
+      insert_field (self->fields[2], code, za_reg, 0);
+      break;
+    default:
+      return false;
+    }
+
+  return true;
+}
+
+bool
+aarch64_ins_sme_za_vrs2 (const aarch64_operand *self,
+			     const aarch64_opnd_info *info,
+			     aarch64_insn *code,
+			     const aarch64_inst *inst ATTRIBUTE_UNUSED,
+			     aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+  int za_reg = info->indexed_za.regno;
+  int regno = info->indexed_za.index.regno & 3;
+  int imm = info->indexed_za.index.imm;
+  int v =  info->indexed_za.v;
+  int countm1 = info->indexed_za.index.countm1;
+
+  insert_field (self->fields[0], code, v, 0);
+  insert_field (self->fields[1], code, regno, 0);
+  switch (info->qualifier)
+    {
+    case AARCH64_OPND_QLF_S_B:
+      insert_field (self->fields[2], code, imm / (countm1 + 1), 0);
+      break;
+    case AARCH64_OPND_QLF_S_H:
+      insert_field (self->fields[2], code, za_reg, 0);
+      insert_field (self->fields[3], code, imm / (countm1 + 1), 0);
+      break;
+    case AARCH64_OPND_QLF_S_S:
+    case AARCH64_OPND_QLF_S_D:
+      insert_field (self->fields[2], code, za_reg, 0);
+      break;
+    default:
+      return false;
+    }
+
+  return true;
+}
+
 /* Encode in SME instruction such as MOVA ZA tile vector register number,
    vector indicator, vector selector and immediate.  */
 bool
@@ -2011,6 +2081,7 @@ aarch64_encode_variant_using_iclass (struct aarch64_inst *inst)
       break;
 
     case sme_misc:
+    case sme2_movaz:
     case sve_misc:
       /* These instructions have only a single variant.  */
       break;
diff --git a/opcodes/aarch64-dis.h b/opcodes/aarch64-dis.h
index 20387db7b39e98c081ff88c8c73b437f0b50fd01..9a38c1ab50f7fdb27588c7451ade19c166e69c96 100644
--- a/opcodes/aarch64-dis.h
+++ b/opcodes/aarch64-dis.h
@@ -124,6 +124,8 @@ AARCH64_DECL_OPD_EXTRACTOR (ext_sve_strided_reglist);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_scale);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_shlimm);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_shrimm);
+AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_vrs1);
+AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_vrs2);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_hv_tiles);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_hv_tiles_range);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_list);
diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
index 7e088a93c107b6152b2df1bb622516d09fce3839..a14b2ca02d2c0fcf6c94f5bb0c587a7168594a5b 100644
--- a/opcodes/aarch64-dis.c
+++ b/opcodes/aarch64-dis.c
@@ -1929,6 +1929,84 @@ aarch64_ext_sme_za_array (const aarch64_operand *self,
   return true;
 }
 
+/* Decode two ZA tile slice (V, Rv, off3| ZAn ,off2 | ZAn, ol| ZAn) feilds.  */
+bool
+aarch64_ext_sme_za_vrs1 (const aarch64_operand *self,
+			  aarch64_opnd_info *info, aarch64_insn code,
+			  const aarch64_inst *inst,
+			  aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+  int v = extract_field (self->fields[0], code, 0);
+  int regno = 12 + extract_field (self->fields[1], code, 0);
+  int imm, za_reg, num_offset = 2;
+
+  switch (info->qualifier)
+    {
+    case AARCH64_OPND_QLF_S_B:
+      imm = extract_field (self->fields[2], code, 0);
+      info->indexed_za.index.imm = imm * num_offset;
+      break;
+    case AARCH64_OPND_QLF_S_H:
+    case AARCH64_OPND_QLF_S_S:
+      za_reg = extract_field (self->fields[2], code, 0);
+      imm = extract_field (self->fields[3], code, 0);
+      info->indexed_za.index.imm = imm * num_offset;
+      info->indexed_za.regno = za_reg;
+      break;
+    case AARCH64_OPND_QLF_S_D:
+      za_reg = extract_field (self->fields[2], code, 0);
+      info->indexed_za.regno = za_reg;
+      break;
+    default:
+      return false;
+    }
+
+  info->indexed_za.index.regno = regno;
+  info->indexed_za.index.countm1 = num_offset - 1;
+  info->indexed_za.v = v;
+  info->indexed_za.group_size = get_opcode_dependent_value (inst->opcode);
+  return true;
+}
+
+/* Decode four ZA tile slice (V, Rv, off3| ZAn ,off2 | ZAn, ol| ZAn) feilds.  */
+bool
+aarch64_ext_sme_za_vrs2 (const aarch64_operand *self,
+			  aarch64_opnd_info *info, aarch64_insn code,
+			  const aarch64_inst *inst,
+			  aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+  int v = extract_field (self->fields[0], code, 0);
+  int regno = 12 + extract_field (self->fields[1], code, 0);
+  int imm, za_reg, num_offset =4;
+
+  switch (info->qualifier)
+    {
+    case AARCH64_OPND_QLF_S_B:
+      imm = extract_field (self->fields[2], code, 0);
+      info->indexed_za.index.imm = imm * num_offset;
+      break;
+    case AARCH64_OPND_QLF_S_H:
+      za_reg = extract_field (self->fields[2], code, 0);
+      imm = extract_field (self->fields[3], code, 0);
+      info->indexed_za.index.imm = imm * num_offset;
+      info->indexed_za.regno = za_reg;
+      break;
+    case AARCH64_OPND_QLF_S_S:
+    case AARCH64_OPND_QLF_S_D:
+      za_reg = extract_field (self->fields[2], code, 0);
+      info->indexed_za.regno = za_reg;
+      break;
+    default:
+      return false;
+    }
+
+  info->indexed_za.index.regno = regno;
+  info->indexed_za.index.countm1 = num_offset - 1;
+  info->indexed_za.v = v;
+  info->indexed_za.group_size = get_opcode_dependent_value (inst->opcode);
+  return true;
+}
+
 bool
 aarch64_ext_sme_addr_ri_u4xvl (const aarch64_operand *self,
                                aarch64_opnd_info *info, aarch64_insn code,
@@ -3160,6 +3238,7 @@ aarch64_decode_variant_using_iclass (aarch64_inst *inst)
       variant = 3;
       break;
 
+    case sme2_movaz:
     case sme_misc:
     case sve_misc:
       /* These instructions have only a single variant.  */
diff --git a/opcodes/aarch64-opc.h b/opcodes/aarch64-opc.h
index f193a90ecc5993645fed573a9a0670aeb4fbf781..587775152e3ef26cb3d09e138eda2791b95cb5d9 100644
--- a/opcodes/aarch64-opc.h
+++ b/opcodes/aarch64-opc.h
@@ -210,6 +210,13 @@ enum aarch64_field_kind
   FLD_sz,
   FLD_type,
   FLD_vldst_size,
+  FLD_off3,
+  FLD_off2,
+  FLD_ZAn_1,
+  FLD_ol,
+  FLD_ZAn_2,
+  FLD_ZAn_3,
+  FLD_ZAn
 };
 
 /* Field description.  */
diff --git a/opcodes/aarch64-opc.c b/opcodes/aarch64-opc.c
index e3ad32f5a1e070fe1cc464e1c0df2b0f4347f45f..cf76871930f9f4e8613a977efb81464dce3d8ba7 100644
--- a/opcodes/aarch64-opc.c
+++ b/opcodes/aarch64-opc.c
@@ -400,6 +400,16 @@ const aarch64_field fields[] =
     { 22,  1 }, /* sz: 1-bit element size select.  */
     { 22,  2 },	/* type: floating point type field in fp data inst.  */
     { 10,  2 },	/* vldst_size: size field in the AdvSIMD load/store inst.  */
+    {  5,  3 }, /* off3: immediate offset used to calculate slice number in a
+		   ZA tile.  */
+    {  5,  2 }, /* off2: immediate offset used to calculate slice number in
+		   a ZA tile.  */
+    {  7,  1 }, /* ZAn_1: name of the 1bit encoded ZA tile.  */
+    {  5,  1 }, /* ol: immediate offset used to calculate slice number in a ZA
+		   tile.  */
+    {  6,  2 }, /* ZAn_2: name of the 2bit encoded ZA tile.  */
+    {  5,  3 }, /* ZAn_3: name of the 3bit encoded ZA tile.  */
+    {  6,  1 }, /* ZAn: name of the bit encoded ZA tile.  */
 };
 
 enum aarch64_operand_class
@@ -1938,6 +1948,49 @@ operand_general_constraint_met_p (const aarch64_opnd_info *opnds, int idx,
 	    return 0;
 	  break;
 
+	case AARCH64_OPND_SME_ZA_array_vrsb_1:
+	  if (!check_za_access (opnd, mismatch_detail, idx, 12, 7, 2,
+				get_opcode_dependent_value (opcode)))
+	    return 0;
+	  break;
+
+	case AARCH64_OPND_SME_ZA_array_vrsh_1:
+	  if (!check_za_access (opnd, mismatch_detail, idx, 12, 3, 2,
+				get_opcode_dependent_value (opcode)))
+	    return 0;
+	  break;
+
+	case AARCH64_OPND_SME_ZA_array_vrss_1:
+	  if (!check_za_access (opnd, mismatch_detail, idx, 12, 1, 2,
+				get_opcode_dependent_value (opcode)))
+	    return 0;
+	  break;
+
+	case AARCH64_OPND_SME_ZA_array_vrsd_1:
+	  if (!check_za_access (opnd, mismatch_detail, idx, 12, 0, 2,
+				get_opcode_dependent_value (opcode)))
+	    return 0;
+	  break;
+
+	case AARCH64_OPND_SME_ZA_array_vrsb_2:
+	  if (!check_za_access (opnd, mismatch_detail, idx, 12, 3, 4,
+				get_opcode_dependent_value (opcode)))
+	    return 0;
+	  break;
+
+	case AARCH64_OPND_SME_ZA_array_vrsh_2:
+	  if (!check_za_access (opnd, mismatch_detail, idx, 12, 1, 4,
+				get_opcode_dependent_value (opcode)))
+	    return 0;
+	  break;
+
+	case AARCH64_OPND_SME_ZA_array_vrss_2:
+	case AARCH64_OPND_SME_ZA_array_vrsd_2:
+	  if (!check_za_access (opnd, mismatch_detail, idx, 12, 0, 4,
+				get_opcode_dependent_value (opcode)))
+	    return 0;
+	  break;
+
 	case AARCH64_OPND_SME_ZA_HV_idx_srcxN:
 	case AARCH64_OPND_SME_ZA_HV_idx_destxN:
 	  size = aarch64_get_qualifier_esize (opnd->qualifier);
@@ -4103,6 +4156,30 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
 		? style_sub_mnem (styler, "vgx4") : "");
       break;
 
+    case AARCH64_OPND_SME_ZA_array_vrsb_1:
+    case AARCH64_OPND_SME_ZA_array_vrsh_1:
+    case AARCH64_OPND_SME_ZA_array_vrss_1:
+    case AARCH64_OPND_SME_ZA_array_vrsd_1:
+    case AARCH64_OPND_SME_ZA_array_vrsb_2:
+    case AARCH64_OPND_SME_ZA_array_vrsh_2:
+    case AARCH64_OPND_SME_ZA_array_vrss_2:
+    case AARCH64_OPND_SME_ZA_array_vrsd_2:
+      snprintf (buf, size, "%s [%s, %s%s%s]",
+		style_reg (styler, "za%d%c%s%s",
+			   opnd->indexed_za.regno,
+			   opnd->indexed_za.v ? 'v': 'h',
+			   opnd->qualifier == AARCH64_OPND_QLF_NIL ? "" : ".",
+			   (opnd->qualifier == AARCH64_OPND_QLF_NIL
+			    ? ""
+			    : aarch64_get_qualifier_name (opnd->qualifier))),
+		style_reg (styler, "w%d", opnd->indexed_za.index.regno),
+		style_imm (styler, "%" PRIi64, opnd->indexed_za.index.imm),
+		opnd->indexed_za.index.countm1 ? ":" : "",
+		opnd->indexed_za.index.countm1  ? style_imm (styler, "%d",
+		opnd->indexed_za.index.imm
+		+ opnd->indexed_za.index.countm1):"");
+      break;
+
     case AARCH64_OPND_SME_SM_ZA:
       snprintf (buf, size, "%s",
 		style_reg (styler, opnd->reg.regno == 's' ? "sm" : "za"));
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index a8ccdafd044efd62d11ba1e4c199792f6dd44559..9c7648b0a6df5444cc89f52aef3d455e624eedbb 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1497,6 +1497,10 @@
 {                                                       \
   QLF2(S_B,S_B),                                        \
 }
+#define OP_SVE_HH					\
+{							\
+  QLF2(S_H,S_H),					\
+}
 #define OP_SVE_BBU                                      \
 {                                                       \
   QLF3(S_B,S_B,NIL),                                \
@@ -2614,6 +2618,8 @@ static const aarch64_feature_set aarch64_feature_d128_the =
   AARCH64_FEATURES (2, D128, THE);
 static const aarch64_feature_set aarch64_feature_b16b16 =
   AARCH64_FEATURE (B16B16);
+static const aarch64_feature_set aarch64_feature_sme2p1 =
+  AARCH64_FEATURE (SME2p1);
 
 #define CORE		&aarch64_feature_v8
 #define FP		&aarch64_feature_fp
@@ -2677,6 +2683,7 @@ static const aarch64_feature_set aarch64_feature_b16b16 =
 #define THE	  &aarch64_feature_the
 #define D128_THE  &aarch64_feature_d128_the
 #define B16B16  &aarch64_feature_b16b16
+#define SME2p1  &aarch64_feature_sme2p1
 
 #define CORE_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS) \
   { NAME, OPCODE, MASK, CLASS, OP, CORE, OPS, QUALS, FLAGS, 0, 0, NULL }
@@ -2743,6 +2750,9 @@ static const aarch64_feature_set aarch64_feature_b16b16 =
 #define SVE2_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
   { NAME, OPCODE, MASK, CLASS, OP, SVE2, OPS, QUALS, \
     FLAGS | F_STRICT, 0, TIED, NULL }
+#define SME2p1_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
+  { NAME, OPCODE, MASK, CLASS, OP, SME2p1, OPS, QUALS, \
+    FLAGS | F_STRICT, 0, TIED, NULL }
 #define SVE2_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
   { NAME, OPCODE, MASK, CLASS, OP, SVE2, OPS, QUALS, \
     FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
@@ -6289,6 +6299,16 @@ const struct aarch64_opcode aarch64_opcode_table[] =
   B16B16_INSN("bfmls", 0x64200c00, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
   B16B16_INSN("bfmul", 0x64202800, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
 
+/* SME2.1 movaz instructions.  */
+  SME2p1_INSN ("movaz", 0xc0060600, 0xffff1f83, sme2_movaz, 0, OP2 (SME_Zdnx4, SME_ZA_array_vrsb_2), OP_SVE_BB, 0, 0),
+  SME2p1_INSN ("movaz", 0xc0460600, 0xffff1f83, sme2_movaz, 0, OP2 (SME_Zdnx4, SME_ZA_array_vrsh_2), OP_SVE_HH, 0, 0),
+  SME2p1_INSN ("movaz", 0xc0860600, 0xffff1f83, sme2_movaz, 0, OP2 (SME_Zdnx4, SME_ZA_array_vrss_2), OP_SVE_SS, 0, 0),
+  SME2p1_INSN ("movaz", 0xc0c60600, 0xffff1f03, sme2_movaz, 0, OP2 (SME_Zdnx4, SME_ZA_array_vrsd_2), OP_SVE_DD, 0, 0),
+
+  SME2p1_INSN ("movaz", 0xc0060200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsb_1), OP_SVE_BB, 0, 0),
+  SME2p1_INSN ("movaz", 0xc0460200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsh_1), OP_SVE_HH, 0, 0),
+  SME2p1_INSN ("movaz", 0xc0860200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrss_1), OP_SVE_SS, 0, 0),
+  SME2p1_INSN ("movaz", 0xc0c60200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsd_1), OP_SVE_DD, 0, 0),
   {0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
 };
 
@@ -6726,6 +6746,22 @@ const struct aarch64_opcode aarch64_opcode_table[] =
     Y(SIMD_REG, regno, "SVE_Vd", 0, F(FLD_SVE_Vd), "a SIMD register")	\
     Y(SIMD_REG, regno, "SVE_Vm", 0, F(FLD_SVE_Vm), "a SIMD register")	\
     Y(SIMD_REG, regno, "SVE_Vn", 0, F(FLD_SVE_Vn), "a SIMD register")	\
+    Y(ZA_ACCESS, sme_za_vrs1, "SME_ZA_array_vrsb_1", 0,			\
+      F(FLD_SME_V,FLD_SME_Rv,FLD_off3), "ZA0 tile")			\
+    Y(ZA_ACCESS, sme_za_vrs1, "SME_ZA_array_vrsh_1", 0,			\
+      F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn_1,FLD_off2), "1 bit ZA tile")	\
+    Y(ZA_ACCESS, sme_za_vrs1, "SME_ZA_array_vrss_1", 0,			\
+      F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn_2,FLD_ol), "2 ZA tile")		\
+    Y(ZA_ACCESS, sme_za_vrs1, "SME_ZA_array_vrsd_1", 0,			\
+      F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn_3), "3 ZA tile")			\
+    Y(ZA_ACCESS, sme_za_vrs2, "SME_ZA_array_vrsb_2", 0,			\
+      F(FLD_SME_V,FLD_SME_Rv,FLD_off2), "ZA0 tile")			\
+    Y(ZA_ACCESS, sme_za_vrs2, "SME_ZA_array_vrsh_2", 0,			\
+      F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn,FLD_ol), "1 bit ZA tile")		\
+    Y(ZA_ACCESS, sme_za_vrs2, "SME_ZA_array_vrss_2", 0,			\
+      F(FLD_SME_V,FLD_SME_Rv,FLD_off2), "2 bit ZA tile")		\
+    Y(ZA_ACCESS, sme_za_vrs2, "SME_ZA_array_vrsd_2", 0,			\
+      F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn_3), "3 bit ZA tile")		\
     Y(SVE_REG, regno, "SVE_Za_5", 0, F(FLD_SVE_Za_5),			\
       "an SVE vector register")						\
     Y(SVE_REG, regno, "SVE_Za_16", 0, F(FLD_SVE_Za_16),			\

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 3/6][Binutils] aarch64: Add support for FEAT_SVE2p1.
  2024-01-15  9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
@ 2024-01-15  9:35   ` Srinath Parvathaneni
  0 siblings, 0 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15  9:35 UTC (permalink / raw)
  To: binutils; +Cc: richard.earnshaw, nickc

[-- Attachment #1: Type: text/plain, Size: 358 bytes --]

Hi,

This patch add support for FEAT_SVE2p1 (SVE2.1 Extension) feature
along with +sve2p1 optional flag to enabe this feature.

Also support for following SVE2p1 instructions is added
addqv, andqv, smaxqv, sminqv, umaxqv, uminqv and uminqv.

Regression testing for aarch64-none-elf target and found no regressions.

Ok for binutils-master?

Regards,
Srinath.

[-- Attachment #2: 3_6.patch --]
[-- Type: text/x-patch, Size: 15176 bytes --]

diff --git a/gas/NEWS b/gas/NEWS
index d2c5c0641c4392d66472e535e5f51756fc5d511f..6b2282a393b8ae7046381935c5a9263879a7893f 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,8 @@
 -*- text -*-
 
+* Add support for the Arm Scalable Vector Extension version 2.1 (SVE2.1)
+  instructions.
+
 * Add support for the AArch64 Scalable Matrix Extension version 2.1 (SME2.1)
   instructions.
 
diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 34159c2168b78fe12d4a549678ae77be88e50313..04dd08a6fa71b84b3e71e0ab422fb6deb9fedb38 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -10354,6 +10354,7 @@ static const struct aarch64_option_cpu_value_table aarch64_features[] = {
 			AARCH64_FEATURE (LSE128)},
   {"b16b16",		AARCH64_FEATURE (B16B16), AARCH64_FEATURE (SVE2)},
   {"sme2p1",		AARCH64_FEATURE (SME2p1), AARCH64_FEATURE (SME2)},
+  {"sve2p1",		AARCH64_FEATURE (SVE2p1), AARCH64_FEATURE (SVE2)},
   {NULL,		AARCH64_NO_FEATURES, AARCH64_NO_FEATURES},
 };
 
diff --git a/gas/doc/c-aarch64.texi b/gas/doc/c-aarch64.texi
index 1f3a4fbcafbab652d70f202b33527ae3dde7e1bb..7a8da72c24a452b1867656acfd1ee1af3e4b56d5 100644
--- a/gas/doc/c-aarch64.texi
+++ b/gas/doc/c-aarch64.texi
@@ -278,6 +278,8 @@ automatically cause those extensions to be disabled.
  @tab Enable the 128-bit Page Descriptor Extension.  This implies @code{lse128}.
 @item @code{sme2p1} @tab N/A @tab No
  @tab Enable the SME2.1 Extension.
+@item @code{sve2p1} @tab N/A @tab No
+ @tab Enable the SVE2.1 Extension.
 @end multitable
 
 @node AArch64 Syntax
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.d b/gas/testsuite/gas/aarch64/sve2p1-1-bad.d
new file mode 100644
index 0000000000000000000000000000000000000000..a2ca49ef487563a55ae8c26ca4318e68da850e64
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.d
@@ -0,0 +1,4 @@
+#name: Illegal test of SVE2.1 min max instructions.
+#as: -march=armv9.4-a
+#source: sve2p1-1.s
+#error_output: sve2p1-1-bad.l
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
new file mode 100644
index 0000000000000000000000000000000000000000..6b07eee9e94d93a9e8d6357a741d2d6ef90601e0
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
@@ -0,0 +1,37 @@
+.*: Assembler messages:
+.*: Error: selected processor does not support `addqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `addqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `addqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `addqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `addqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `addqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `andqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `andqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `andqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `andqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `andqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `andqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `smaxqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `smaxqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `smaxqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `smaxqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `smaxqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `smaxqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `umaxqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `umaxqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `umaxqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `umaxqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `umaxqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `umaxqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `sminqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `sminqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `sminqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `sminqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `sminqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `sminqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `uminqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `uminqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `uminqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `uminqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `uminqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `uminqv v16.4s,p7,z0.s'
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.d b/gas/testsuite/gas/aarch64/sve2p1-1.d
new file mode 100644
index 0000000000000000000000000000000000000000..d3d14f3c455aa352d31e01195196e03397ed4271
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.d
@@ -0,0 +1,46 @@
+#name: Test of SVE2.1 min max instructions.
+#as: -march=armv9.4-a+sve2p1
+#objdump: -dr
+
+[^:]+:     file format .*
+
+
+[^:]+:
+
+[^:]+:
+.*:	04052200 	addqv	v0.16b, p0, z16.b
+.*:	04452501 	addqv	v1.8h, p1, z8.h
+.*:	04852882 	addqv	v2.4s, p2, z4.s
+.*:	04c52c44 	addqv	v4.2d, p3, z2.d
+.*:	04c53028 	addqv	v8.2d, p4, z1.d
+.*:	04853c10 	addqv	v16.4s, p7, z0.s
+.*:	041e2200 	andqv	v0.16b, p0, z16.b
+.*:	045e2501 	andqv	v1.8h, p1, z8.h
+.*:	049e2882 	andqv	v2.4s, p2, z4.s
+.*:	04de2c44 	andqv	v4.2d, p3, z2.d
+.*:	04de3028 	andqv	v8.2d, p4, z1.d
+.*:	049e3c10 	andqv	v16.4s, p7, z0.s
+.*:	040c2200 	smaxqv	v0.16b, p0, z16.b
+.*:	044c2501 	smaxqv	v1.8h, p1, z8.h
+.*:	048c2882 	smaxqv	v2.4s, p2, z4.s
+.*:	04cc2c44 	smaxqv	v4.2d, p3, z2.d
+.*:	04cc3028 	smaxqv	v8.2d, p4, z1.d
+.*:	048c3c10 	smaxqv	v16.4s, p7, z0.s
+.*:	040d2200 	umaxqv	v0.16b, p0, z16.b
+.*:	044d2501 	umaxqv	v1.8h, p1, z8.h
+.*:	048d2882 	umaxqv	v2.4s, p2, z4.s
+.*:	04cd2c44 	umaxqv	v4.2d, p3, z2.d
+.*:	04cd3028 	umaxqv	v8.2d, p4, z1.d
+.*:	048d3c10 	umaxqv	v16.4s, p7, z0.s
+.*:	040e2200 	sminqv	v0.16b, p0, z16.b
+.*:	044e2501 	sminqv	v1.8h, p1, z8.h
+.*:	048e2882 	sminqv	v2.4s, p2, z4.s
+.*:	04ce2c44 	sminqv	v4.2d, p3, z2.d
+.*:	04ce3028 	sminqv	v8.2d, p4, z1.d
+.*:	048e3c10 	sminqv	v16.4s, p7, z0.s
+.*:	040f2200 	uminqv	v0.16b, p0, z16.b
+.*:	044f2501 	uminqv	v1.8h, p1, z8.h
+.*:	048f2882 	uminqv	v2.4s, p2, z4.s
+.*:	04cf2c44 	uminqv	v4.2d, p3, z2.d
+.*:	04cf3028 	uminqv	v8.2d, p4, z1.d
+.*:	048f3c10 	uminqv	v16.4s, p7, z0.s
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.s b/gas/testsuite/gas/aarch64/sve2p1-1.s
new file mode 100644
index 0000000000000000000000000000000000000000..c56ebf856ab5815efa01e06f40d04360f8afc7bc
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.s
@@ -0,0 +1,41 @@
+addqv v0.16b, p0, z16.b
+addqv v1.8h, p1, z8.h
+addqv v2.4s, p2, z4.s
+addqv v4.2d, p3, z2.d
+addqv v8.2d, p4, z1.d
+addqv v16.4s, p7, z0.s
+
+andqv v0.16b, p0, z16.b
+andqv v1.8h, p1, z8.h
+andqv v2.4s, p2, z4.s
+andqv v4.2d, p3, z2.d
+andqv v8.2d, p4, z1.d
+andqv v16.4s, p7, z0.s
+
+smaxqv v0.16b, p0, z16.b
+smaxqv v1.8h, p1, z8.h
+smaxqv v2.4s, p2, z4.s
+smaxqv v4.2d, p3, z2.d
+smaxqv v8.2d, p4, z1.d
+smaxqv v16.4s, p7, z0.s
+
+umaxqv v0.16b, p0, z16.b
+umaxqv v1.8h, p1, z8.h
+umaxqv v2.4s, p2, z4.s
+umaxqv v4.2d, p3, z2.d
+umaxqv v8.2d, p4, z1.d
+umaxqv v16.4s, p7, z0.s
+
+sminqv v0.16b, p0, z16.b
+sminqv v1.8h, p1, z8.h
+sminqv v2.4s, p2, z4.s
+sminqv v4.2d, p3, z2.d
+sminqv v8.2d, p4, z1.d
+sminqv v16.4s, p7, z0.s
+
+uminqv v0.16b, p0, z16.b
+uminqv v1.8h, p1, z8.h
+uminqv v2.4s, p2, z4.s
+uminqv v4.2d, p3, z2.d
+uminqv v8.2d, p4, z1.d
+uminqv v16.4s, p7, z0.s
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index 648e25f3e4242bb738eee5f62079838784223b8a..1af49c406e06e79ba81a1f01887d43da37d8a625 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -226,6 +226,8 @@ enum aarch64_feature_bit {
   AARCH64_FEATURE_B16B16,
   /* SME2.1 instructions.  */
   AARCH64_FEATURE_SME2p1,
+  /* SVE2.1 instructions.  */
+  AARCH64_FEATURE_SVE2p1,
   AARCH64_NUM_FEATURES
 };
 
@@ -1000,6 +1002,7 @@ enum aarch64_insn_class
   cssc,
   gcs,
   the,
+  sve2_urqvs
 };
 
 /* Opcode enumerators.  */
@@ -1272,7 +1275,9 @@ extern const aarch64_opcode aarch64_opcode_table[];
    allow.  This impacts the constraintts on assembly but yelds no
    impact on disassembly.  */
 #define F_OPD_NARROW (1ULL << 33)
-/* Next bit is 34.  */
+/* For the instruction with size[22:23] field.  */
+#define F_OPD_SIZE (1ULL << 34)
+/* Next bit is 35.  */
 
 /* Instruction constraints.  */
 /* This instruction has a predication constraint on the instruction at PC+4.  */
@@ -1339,7 +1344,8 @@ static inline bool
 opcode_has_special_coder (const aarch64_opcode *opcode)
 {
   return (opcode->flags & (F_SF | F_LSE_SZ | F_SIZEQ | F_FPTYPE | F_SSIZE | F_T
-	  | F_GPRSIZE_IN_Q | F_LDS_SIZE | F_MISC | F_N | F_COND)) != 0;
+	  | F_GPRSIZE_IN_Q | F_LDS_SIZE | F_MISC | F_N | F_COND
+	  | F_OPD_SIZE)) != 0;
 }
 \f
 struct aarch64_name_value_pair
diff --git a/opcodes/aarch64-asm.c b/opcodes/aarch64-asm.c
index 3fac127a5899077e2ac19c5e98df737b8ffbe147..1dfd59df42dbbe5640ece7c83f43f027a8329d06 100644
--- a/opcodes/aarch64-asm.c
+++ b/opcodes/aarch64-asm.c
@@ -1981,6 +1981,20 @@ do_special_encoding (struct aarch64_inst *inst)
       gen_sub_field (FLD_imm5, 0, num + 1, &field);
       insert_field_2 (&field, &inst->value, 1 << num, inst->opcode->mask);
     }
+
+  if ((inst->opcode->flags & F_OPD_SIZE) && inst->opcode->iclass == sve2_urqvs)
+    {
+      enum aarch64_opnd_qualifier qualifier[1];
+      aarch64_insn value1 = 0;
+      idx = 0;
+      qualifier[0] = inst->operands[idx].qualifier;
+      qualifier[1] = inst->operands[idx+2].qualifier;
+      value = aarch64_get_qualifier_standard_value (qualifier[0]);
+      value1 = aarch64_get_qualifier_standard_value (qualifier[1]);
+      assert ((value >> 1) == value1);
+      insert_field (FLD_size, &inst->value, value1, inst->opcode->mask);
+    }
+
   if (inst->opcode->flags & F_GPRSIZE_IN_Q)
     {
       /* Use Rt to encode in the case of e.g.
diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
index a14b2ca02d2c0fcf6c94f5bb0c587a7168594a5b..d395438966f16d1fc0fa7117a434cff50901f96e 100644
--- a/opcodes/aarch64-dis.c
+++ b/opcodes/aarch64-dis.c
@@ -2609,6 +2609,16 @@ do_special_decoding (aarch64_inst *inst)
 	get_vreg_qualifier_from_value ((num << 1) | Q);
     }
 
+  if ((inst->opcode->flags & F_OPD_SIZE) && inst->opcode->iclass == sve2_urqvs)
+    {
+      unsigned size;
+      size = (unsigned) extract_field (FLD_size, inst->value,
+				       inst->opcode->mask);
+      inst->operands[0].qualifier
+	= get_vreg_qualifier_from_value (1 + (size << 1));
+      inst->operands[2].qualifier = get_sreg_qualifier_from_value (size);
+    }
+
   if (inst->opcode->flags & F_GPRSIZE_IN_Q)
     {
       /* Use Rt to encode in the case of e.g.
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 9c7648b0a6df5444cc89f52aef3d455e624eedbb..f433257634e72b6afb64d58a1f0f052164291033 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1487,6 +1487,10 @@
    - P: the operand has a /[ZM] suffix and the choice of suffix is not
      the same for all variants.
 
+   - v: the operand has a V_[16B|8H|4S|2D] qualifier and the choice of
+     qualifier suffix is not the same for all variants.  This is used for
+     the same kinds of operands as [BHSD] above.
+
    The _<sizes>, if present, give the subset of [BHSD] that are accepted
    by the V entries in <operands>.  */
 #define OP_SVE_B                                        \
@@ -1911,6 +1915,13 @@
   QLF3(S_S,S_H,NIL),                                    \
   QLF3(S_D,S_S,NIL),                                    \
 }
+#define OP_SVE_vUS_BHSD_BHSD				\
+{							\
+  QLF3(V_16B,NIL,S_B),					\
+  QLF3(V_8H,NIL,S_H),					\
+  QLF3(V_4S,NIL,S_S),					\
+  QLF3(V_2D,NIL,S_D),					\
+}
 #define OP_SVE_VMV_SD                                   \
 {                                                       \
   QLF3(S_S,P_M,S_S),                                    \
@@ -2620,6 +2631,8 @@ static const aarch64_feature_set aarch64_feature_b16b16 =
   AARCH64_FEATURE (B16B16);
 static const aarch64_feature_set aarch64_feature_sme2p1 =
   AARCH64_FEATURE (SME2p1);
+static const aarch64_feature_set aarch64_feature_sve2p1 =
+  AARCH64_FEATURE (SVE2p1);
 
 #define CORE		&aarch64_feature_v8
 #define FP		&aarch64_feature_fp
@@ -2684,6 +2697,7 @@ static const aarch64_feature_set aarch64_feature_sme2p1 =
 #define D128_THE  &aarch64_feature_d128_the
 #define B16B16  &aarch64_feature_b16b16
 #define SME2p1  &aarch64_feature_sme2p1
+#define SVE2p1  &aarch64_feature_sve2p1
 
 #define CORE_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS) \
   { NAME, OPCODE, MASK, CLASS, OP, CORE, OPS, QUALS, FLAGS, 0, 0, NULL }
@@ -2762,6 +2776,12 @@ static const aarch64_feature_set aarch64_feature_sme2p1 =
 #define B16B16_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
   { NAME, OPCODE, MASK, CLASS, OP, B16B16, OPS, QUALS, \
     FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
+#define SVE2p1_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
+  { NAME, OPCODE, MASK, CLASS, OP, SVE2p1, OPS, QUALS, \
+    FLAGS | F_STRICT, 0, TIED, NULL }
+#define SVE2p1_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
+  { NAME, OPCODE, MASK, CLASS, OP, SVE2p1, OPS, QUALS, \
+    FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
 #define SVE2AES_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
   { NAME, OPCODE, MASK, CLASS, OP, SVE2_AES, OPS, QUALS, \
     FLAGS | F_STRICT, 0, TIED, NULL }
@@ -6309,6 +6329,15 @@ const struct aarch64_opcode aarch64_opcode_table[] =
   SME2p1_INSN ("movaz", 0xc0460200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsh_1), OP_SVE_HH, 0, 0),
   SME2p1_INSN ("movaz", 0xc0860200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrss_1), OP_SVE_SS, 0, 0),
   SME2p1_INSN ("movaz", 0xc0c60200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsd_1), OP_SVE_DD, 0, 0),
+
+/* SVE2p1 Instructions.  */
+  SVE2p1_INSNC("addqv",0x04052000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("andqv",0x041e2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("smaxqv",0x040c2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("sminqv",0x040e2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("umaxqv",0x040d2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("uminqv",0x040f2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+
   {0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
 };
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 4/6][Binutils] aarch64: Add SVE2.1 dupq, eorqv and extq instructions.
  2024-01-15  9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
  2024-01-15  9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
@ 2024-01-15  9:37 ` Srinath Parvathaneni
  2024-01-15  9:38 ` PATCH 5/6][Binutils] aarch64: Add SVE2.1 fmin and fmax instructions Srinath Parvathaneni
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15  9:37 UTC (permalink / raw)
  To: binutils; +Cc: richard.earnshaw, nickc

[-- Attachment #1: Type: text/plain, Size: 190 bytes --]

Hi,

This patch add support for SVE2.1 instruction dupq, eorqv and extq.

Regression testing for aarch64-none-elf target and found no regressions.

Ok for binutils-master?

Regards,
Srinath.

[-- Attachment #2: 4_6.patch --]
[-- Type: text/x-patch, Size: 13581 bytes --]

diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 04dd08a6fa71b84b3e71e0ab422fb6deb9fedb38..0665732fe03cc59df4ebd36ee1afbad08c22b72e 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -6698,6 +6698,8 @@ parse_operands (char *str, const aarch64_opcode *opcode)
 	case AARCH64_OPND_SVE_Zm4_11_INDEX:
 	case AARCH64_OPND_SVE_Zm4_INDEX:
 	case AARCH64_OPND_SVE_Zn_INDEX:
+	case AARCH64_OPND_SVE_Zm_imm4:
+	case AARCH64_OPND_SVE_Zn_5_INDEX:
 	case AARCH64_OPND_SME_Zm_INDEX1:
 	case AARCH64_OPND_SME_Zm_INDEX2:
 	case AARCH64_OPND_SME_Zm_INDEX3_1:
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
index 6b07eee9e94d93a9e8d6357a741d2d6ef90601e0..f5a80d26768882f2b2e16840ad587612d34ae15e 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
@@ -35,3 +35,23 @@
 .*: Error: selected processor does not support `uminqv v4.2d,p3,z2.d'
 .*: Error: selected processor does not support `uminqv v8.2d,p4,z1.d'
 .*: Error: selected processor does not support `uminqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `dupq z10.b,z20.b\[0\]'
+.*: Error: selected processor does not support `dupq z10.b,z20.b\[15\]'
+.*: Error: selected processor does not support `dupq z10.h,z20.h\[0\]'
+.*: Error: selected processor does not support `dupq z10.h,z20.h\[7\]'
+.*: Error: selected processor does not support `dupq z10.s,z20.s\[0\]'
+.*: Error: selected processor does not support `dupq z10.s,z20.s\[3\]'
+.*: Error: selected processor does not support `dupq z10.d,z20.d\[0\]'
+.*: Error: selected processor does not support `dupq z10.d,z20.d\[1\]'
+.*: Error: selected processor does not support `eorqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `eorqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `eorqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `eorqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `eorqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `eorqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `extq z0.b,z0.b,z10.b\[15\]'
+.*: Error: selected processor does not support `extq z1.b,z1.b,z15.b\[7\]'
+.*: Error: selected processor does not support `extq z2.b,z2.b,z5.b\[3\]'
+.*: Error: selected processor does not support `extq z4.b,z4.b,z12.b\[1\]'
+.*: Error: selected processor does not support `extq z8.b,z8.b,z7.b\[4\]'
+.*: Error: selected processor does not support `extq z16.b,z16.b,z1.b\[8\]'
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.d b/gas/testsuite/gas/aarch64/sve2p1-1.d
index d3d14f3c455aa352d31e01195196e03397ed4271..6d718aec7cad5511bda282865d0667a6cdaa188d 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.d
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.d
@@ -44,3 +44,23 @@
 .*:	04cf2c44 	uminqv	v4.2d, p3, z2.d
 .*:	04cf3028 	uminqv	v8.2d, p4, z1.d
 .*:	048f3c10 	uminqv	v16.4s, p7, z0.s
+.*:	0530268a 	dupq	z10.b, z20.b\[0\]
+.*:	053f268a 	dupq	z10.b, z20.b\[15\]
+.*:	0521268a 	dupq	z10.h, z20.h\[0\]
+.*:	052f268a 	dupq	z10.h, z20.h\[7\]
+.*:	0522268a 	dupq	z10.s, z20.s\[0\]
+.*:	052e268a 	dupq	z10.s, z20.s\[3\]
+.*:	0524268a 	dupq	z10.d, z20.d\[0\]
+.*:	052c268a 	dupq	z10.d, z20.d\[1\]
+.*:	041d2200 	eorqv	v0.16b, p0, z16.b
+.*:	045d2501 	eorqv	v1.8h, p1, z8.h
+.*:	049d2882 	eorqv	v2.4s, p2, z4.s
+.*:	04dd2c44 	eorqv	v4.2d, p3, z2.d
+.*:	04dd3028 	eorqv	v8.2d, p4, z1.d
+.*:	049d3c10 	eorqv	v16.4s, p7, z0.s
+.*:	056a27c0 	extq	z0.b, z0.b, z10.b\[15\]
+.*:	056f25c1 	extq	z1.b, z1.b, z15.b\[7\]
+.*:	056524c2 	extq	z2.b, z2.b, z5.b\[3\]
+.*:	056c2444 	extq	z4.b, z4.b, z12.b\[1\]
+.*:	05672508 	extq	z8.b, z8.b, z7.b\[4\]
+.*:	05612610 	extq	z16.b, z16.b, z1.b\[8\]
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.s b/gas/testsuite/gas/aarch64/sve2p1-1.s
index c56ebf856ab5815efa01e06f40d04360f8afc7bc..5278dcf5e67b4cb34ab45b2b2725ab3af14c2594 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.s
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.s
@@ -39,3 +39,25 @@ uminqv v2.4s, p2, z4.s
 uminqv v4.2d, p3, z2.d
 uminqv v8.2d, p4, z1.d
 uminqv v16.4s, p7, z0.s
+dupq z10.b, z20.b[0]
+dupq z10.b, z20.b[15]
+dupq z10.h, z20.h[0]
+dupq z10.h, z20.h[7]
+dupq z10.s, z20.s[0]
+dupq z10.s, z20.s[3]
+dupq z10.d, z20.d[0]
+dupq z10.d, z20.d[1]
+
+eorqv v0.16b, p0, z16.b
+eorqv v1.8h, p1, z8.h
+eorqv v2.4s, p2, z4.s
+eorqv v4.2d, p3, z2.d
+eorqv v8.2d, p4, z1.d
+eorqv v16.4s, p7, z0.s
+
+extq z0.b, z0.b, z10.b[15]
+extq z1.b, z1.b, z15.b[7]
+extq z2.b, z2.b, z5.b[3]
+extq z4.b, z4.b, z12.b[1]
+extq z8.b, z8.b, z7.b[4]
+extq z16.b, z16.b, z1.b[8]
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index 1af49c406e06e79ba81a1f01887d43da37d8a625..de161db75d509b0ac96c604da7bc9743193d23b2 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -727,8 +727,10 @@ enum aarch64_opnd
   AARCH64_OPND_SVE_Zm3_19_INDEX, /* z0-z7[0-3] in Zm3_INDEX plus bit 19.  */
   AARCH64_OPND_SVE_Zm3_22_INDEX, /* z0-z7[0-7] in Zm3_INDEX plus bit 22.  */
   AARCH64_OPND_SVE_Zm4_11_INDEX, /* z0-z15[0-3] in Zm plus bit 11.  */
+  AARCH64_OPND_SVE_Zm_imm4,     /* SVE vector register with 4bit index.  */
   AARCH64_OPND_SVE_Zm4_INDEX,	/* z0-z15[0-1] in Zm, bits [20,16].  */
   AARCH64_OPND_SVE_Zn,		/* SVE vector register in Zn.  */
+  AARCH64_OPND_SVE_Zn_5_INDEX,	/* Indexed SVE vector register, for DUPQ.  */
   AARCH64_OPND_SVE_Zn_INDEX,	/* Indexed SVE vector register, for DUP.  */
   AARCH64_OPND_SVE_ZnxN,	/* SVE vector register list in Zn.  */
   AARCH64_OPND_SVE_Zt,		/* SVE vector register in Zt.  */
@@ -1002,7 +1004,8 @@ enum aarch64_insn_class
   cssc,
   gcs,
   the,
-  sve2_urqvs
+  sve2_urqvs,
+  sve_index1,
 };
 
 /* Opcode enumerators.  */
diff --git a/opcodes/aarch64-asm.h b/opcodes/aarch64-asm.h
index d4b6407dc5de8d6e103ee8ca5b5f2c6bb814647f..e48bf0db8a86149155e325923f6644bb90410ccb 100644
--- a/opcodes/aarch64-asm.h
+++ b/opcodes/aarch64-asm.h
@@ -93,6 +93,7 @@ AARCH64_DECL_OPD_INSERTER (ins_sve_float_half_one);
 AARCH64_DECL_OPD_INSERTER (ins_sve_float_half_two);
 AARCH64_DECL_OPD_INSERTER (ins_sve_float_zero_one);
 AARCH64_DECL_OPD_INSERTER (ins_sve_index);
+AARCH64_DECL_OPD_INSERTER (ins_sve_index_imm);
 AARCH64_DECL_OPD_INSERTER (ins_sve_limm_mov);
 AARCH64_DECL_OPD_INSERTER (ins_sve_quad_index);
 AARCH64_DECL_OPD_INSERTER (ins_sve_reglist);
diff --git a/opcodes/aarch64-asm.c b/opcodes/aarch64-asm.c
index 1dfd59df42dbbe5640ece7c83f43f027a8329d06..0de09f0435abc3b7761707b2c81c58f5f3b1a10e 100644
--- a/opcodes/aarch64-asm.c
+++ b/opcodes/aarch64-asm.c
@@ -1220,6 +1220,21 @@ aarch64_ins_sve_index (const aarch64_operand *self,
   return true;
 }
 
+/* Encode Zn.<T>[<imm>], where <imm> is an immediate with range of 0 to one less
+   than the number of elements in 128 bit, which can encode il:tsz.  */
+bool
+aarch64_ins_sve_index_imm (const aarch64_operand *self,
+			   const aarch64_opnd_info *info, aarch64_insn *code,
+			   const aarch64_inst *inst ATTRIBUTE_UNUSED,
+			   aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+  insert_field (self->fields[0], code, info->reglane.regno, 0);
+  unsigned int esize = aarch64_get_qualifier_esize (info->qualifier);
+  insert_fields (code, (info->reglane.index * 2 + 1) * esize, 0,
+		 2, self->fields[1],self->fields[2]);
+  return true;
+}
+
 /* Encode a logical/bitmask immediate for the MOV alias of SVE DUPM.  */
 bool
 aarch64_ins_sve_limm_mov (const aarch64_operand *self,
@@ -2079,6 +2094,7 @@ aarch64_encode_variant_using_iclass (struct aarch64_inst *inst)
 
     case sme_shift:
     case sve_index:
+    case sve_index1:
     case sve_shift_pred:
     case sve_shift_unpred:
     case sve_shift_tsz_hsd:
diff --git a/opcodes/aarch64-dis.h b/opcodes/aarch64-dis.h
index 9a38c1ab50f7fdb27588c7451ade19c166e69c96..30212f2ae2c2759b5667e5a007912d22c4a702fc 100644
--- a/opcodes/aarch64-dis.h
+++ b/opcodes/aarch64-dis.h
@@ -117,6 +117,7 @@ AARCH64_DECL_OPD_EXTRACTOR (ext_sve_float_half_one);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_float_half_two);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_float_zero_one);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_index);
+AARCH64_DECL_OPD_EXTRACTOR (ext_sve_index_imm);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_limm_mov);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_quad_index);
 AARCH64_DECL_OPD_EXTRACTOR (ext_sve_reglist);
diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
index d395438966f16d1fc0fa7117a434cff50901f96e..bffa760004a3ede5c14287ee4db09d6db371bc87 100644
--- a/opcodes/aarch64-dis.c
+++ b/opcodes/aarch64-dis.c
@@ -2097,6 +2097,26 @@ aarch64_ext_sve_index (const aarch64_operand *self,
   return true;
 }
 
+/* Decode Zn.<T>[<imm>], where <imm> is an immediate with range of 0 to one less
+   than the number of elements in 128 bit, which can encode il:tsz.  */
+bool
+aarch64_ext_sve_index_imm (const aarch64_operand *self,
+			   aarch64_opnd_info *info, aarch64_insn code,
+			   const aarch64_inst *inst ATTRIBUTE_UNUSED,
+			   aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+  int val;
+
+  info->reglane.regno = extract_field (self->fields[0], code, 0);
+  val = extract_fields (code, 0, 2, self->fields[2], self->fields[1]);
+  if ((val & 15) == 0)
+    return 0;
+  while ((val & 1) == 0)
+    val /= 2;
+  info->reglane.index = val / 2;
+  return true;
+}
+
 /* Decode a logical immediate for the MOV alias of SVE DUPM.  */
 bool
 aarch64_ext_sve_limm_mov (const aarch64_operand *self,
@@ -3231,6 +3251,17 @@ aarch64_decode_variant_using_iclass (aarch64_inst *inst)
 	}
       break;
 
+    case sve_index1:
+      i = extract_fields (inst->value, 0, 2, FLD_SVE_tsz, FLD_SVE_i2h);
+      if ((i & 15) == 0)
+	return false;
+      while ((i & 1) == 0)
+	{
+	  i >>= 1;
+	  variant += 1;
+	}
+      break;
+
     case sve_limm:
       /* Pick the smallest applicable element size.  */
       if ((inst->value & 0x20600) == 0x600)
diff --git a/opcodes/aarch64-opc.c b/opcodes/aarch64-opc.c
index cf76871930f9f4e8613a977efb81464dce3d8ba7..1d8ed26c7090e4b73489b15e74a911e33b54555c 100644
--- a/opcodes/aarch64-opc.c
+++ b/opcodes/aarch64-opc.c
@@ -1794,6 +1794,18 @@ operand_general_constraint_met_p (const aarch64_opnd_info *opnds, int idx,
 	    return 0;
 	  break;
 
+	case AARCH64_OPND_SVE_Zm_imm4:
+	  if (!check_reglane (opnd, mismatch_detail, idx, "z", 0, 31, 0, 15))
+	    return 0;
+	  break;
+
+	case AARCH64_OPND_SVE_Zn_5_INDEX:
+	  size = aarch64_get_qualifier_esize (opnd->qualifier);
+	  if (!check_reglane (opnd, mismatch_detail, idx, "z", 0, 31,
+			      0, 16 / size - 1))
+	    return 0;
+	  break;
+
 	case AARCH64_OPND_SME_PNn3_INDEX1:
 	case AARCH64_OPND_SME_PNn3_INDEX2:
 	  size = get_operand_field_width (get_operand_from_code (type), 1);
@@ -4074,6 +4086,7 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
     case AARCH64_OPND_SME_Zm_INDEX3_1:
     case AARCH64_OPND_SME_Zm_INDEX3_2:
     case AARCH64_OPND_SME_Zm_INDEX3_10:
+    case AARCH64_OPND_SVE_Zn_5_INDEX:
     case AARCH64_OPND_SME_Zm_INDEX4_1:
     case AARCH64_OPND_SME_Zm_INDEX4_10:
     case AARCH64_OPND_SME_Zn_INDEX1_16:
@@ -4082,6 +4095,7 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
     case AARCH64_OPND_SME_Zn_INDEX3_14:
     case AARCH64_OPND_SME_Zn_INDEX3_15:
     case AARCH64_OPND_SME_Zn_INDEX4_14:
+    case AARCH64_OPND_SVE_Zm_imm4:
       snprintf (buf, size, "%s[%s]",
 		(opnd->qualifier == AARCH64_OPND_QLF_NIL
 		 ? style_reg (styler, "z%d", opnd->reglane.regno)
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index f433257634e72b6afb64d58a1f0f052164291033..07f4eb319e9be1a8150224b59aba1ab831e51b29 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -6337,6 +6337,10 @@ const struct aarch64_opcode aarch64_opcode_table[] =
   SVE2p1_INSNC("sminqv",0x040e2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
   SVE2p1_INSNC("umaxqv",0x040d2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
   SVE2p1_INSNC("uminqv",0x040f2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("eorqv",0x041d2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+
+  SVE2p1_INSN("dupq",0x05202400, 0xffe0fc00, sve_index1, 0, OP2 (SVE_Zd, SVE_Zn_5_INDEX), OP_SVE_VV_BHSD, 0, 0),
+  SVE2p1_INSN("extq",0x05602400, 0xfff0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zd, SVE_Zm_imm4), OP_SVE_BBB, 0, 0),
 
   {0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
 };
@@ -6816,11 +6820,17 @@ const struct aarch64_opcode aarch64_opcode_table[] =
     Y(SVE_REG, sve_quad_index, "SVE_Zm4_11_INDEX", 			\
       4 << OPD_F_OD_LSB, F(FLD_SVE_i2h, FLD_SVE_i3l, FLD_SVE_imm4),     \
       "an indexed SVE vector register")					\
+    Y(SVE_REG, sve_quad_index, "SVE_Zm_imm4",				\
+      5 << OPD_F_OD_LSB, F(FLD_SVE_Zm_5, FLD_SVE_imm4),			\
+      "an 4bit indexed SVE vector register")				\
     Y(SVE_REG, sve_quad_index, "SVE_Zm4_INDEX", 			\
       4 << OPD_F_OD_LSB, F(FLD_SVE_Zm_16),				\
       "an indexed SVE vector register")					\
     Y(SVE_REG, regno, "SVE_Zn", 0, F(FLD_SVE_Zn),			\
       "an SVE vector register")						\
+    Y(SVE_REG, sve_index_imm, "SVE_Zn_5_INDEX", 0,			\
+      F(FLD_SVE_Zn, FLD_SVE_i2h, FLD_SVE_tsz),				\
+      "a 5 bit idexed SVE vector register")				\
     Y(SVE_REG, sve_index, "SVE_Zn_INDEX", 0, F(FLD_SVE_Zn),		\
       "an indexed SVE vector register")					\
     Y(SVE_REGLIST, sve_reglist, "SVE_ZnxN", 0, F(FLD_SVE_Zn),		\

^ permalink raw reply	[flat|nested] 7+ messages in thread

* PATCH 5/6][Binutils] aarch64: Add SVE2.1 fmin and fmax instructions.
  2024-01-15  9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
  2024-01-15  9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
  2024-01-15  9:37 ` [PATCH 4/6][Binutils] aarch64: Add SVE2.1 dupq, eorqv and extq instructions Srinath Parvathaneni
@ 2024-01-15  9:38 ` Srinath Parvathaneni
  2024-01-15  9:40 ` [PATCH 6/6][Binutils] aarch64: Add SVE2.1 Contiguous load/store instructions Srinath Parvathaneni
  2024-01-15 11:46 ` [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Nick Clifton
  4 siblings, 0 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15  9:38 UTC (permalink / raw)
  To: binutils; +Cc: richard.earnshaw, nickc

[-- Attachment #1: Type: text/plain, Size: 215 bytes --]

Hi,

This patch add support for SVE2.1 instruction faddqv,
fmaxnmqv, fmaxqv, fminnmqv and fminqv.

Regression testing for aarch64-none-elf target and found no regressions.

Ok for binutils-master?

Regards,
Srinath.

[-- Attachment #2: 5_6.patch --]
[-- Type: text/x-patch, Size: 6680 bytes --]

diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
index f5a80d26768882f2b2e16840ad587612d34ae15e..08aef46de61a6cbbe88ebac77da03ee97c9ebe7c 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
@@ -55,3 +55,28 @@
 .*: Error: selected processor does not support `extq z4.b,z4.b,z12.b\[1\]'
 .*: Error: selected processor does not support `extq z8.b,z8.b,z7.b\[4\]'
 .*: Error: selected processor does not support `extq z16.b,z16.b,z1.b\[8\]'
+.*: Error: selected processor does not support `faddqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `faddqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `faddqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `faddqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `faddqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `fmaxnmqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `fmaxnmqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `fmaxnmqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `fmaxnmqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `fmaxnmqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `fmaxqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `fmaxqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `fmaxqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `fmaxqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `fmaxqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `fminnmqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `fminnmqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `fminnmqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `fminnmqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `fminnmqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `fminqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `fminqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `fminqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `fminqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `fminqv v16.4s,p7,z0.s'
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.d b/gas/testsuite/gas/aarch64/sve2p1-1.d
index 6d718aec7cad5511bda282865d0667a6cdaa188d..437ce9789834683963910141c1468ad46b273ded 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.d
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.d
@@ -64,3 +64,28 @@
 .*:	056c2444 	extq	z4.b, z4.b, z12.b\[1\]
 .*:	05672508 	extq	z8.b, z8.b, z7.b\[4\]
 .*:	05612610 	extq	z16.b, z16.b, z1.b\[8\]
+.*:	6450a501 	faddqv	v1.8h, p1, z8.h
+.*:	6490a882 	faddqv	v2.4s, p2, z4.s
+.*:	64d0ac44 	faddqv	v4.2d, p3, z2.d
+.*:	64d0b028 	faddqv	v8.2d, p4, z1.d
+.*:	6490bc10 	faddqv	v16.4s, p7, z0.s
+.*:	6454a501 	fmaxnmqv	v1.8h, p1, z8.h
+.*:	6494a882 	fmaxnmqv	v2.4s, p2, z4.s
+.*:	64d4ac44 	fmaxnmqv	v4.2d, p3, z2.d
+.*:	64d4b028 	fmaxnmqv	v8.2d, p4, z1.d
+.*:	6494bc10 	fmaxnmqv	v16.4s, p7, z0.s
+.*:	6456a501 	fmaxqv	v1.8h, p1, z8.h
+.*:	6496a882 	fmaxqv	v2.4s, p2, z4.s
+.*:	64d6ac44 	fmaxqv	v4.2d, p3, z2.d
+.*:	64d6b028 	fmaxqv	v8.2d, p4, z1.d
+.*:	6496bc10 	fmaxqv	v16.4s, p7, z0.s
+.*:	6455a501 	fminnmqv	v1.8h, p1, z8.h
+.*:	6495a882 	fminnmqv	v2.4s, p2, z4.s
+.*:	64d5ac44 	fminnmqv	v4.2d, p3, z2.d
+.*:	64d5b028 	fminnmqv	v8.2d, p4, z1.d
+.*:	6495bc10 	fminnmqv	v16.4s, p7, z0.s
+.*:	6457a501 	fminqv	v1.8h, p1, z8.h
+.*:	6497a882 	fminqv	v2.4s, p2, z4.s
+.*:	64d7ac44 	fminqv	v4.2d, p3, z2.d
+.*:	64d7b028 	fminqv	v8.2d, p4, z1.d
+.*:	6497bc10 	fminqv	v16.4s, p7, z0.s
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.s b/gas/testsuite/gas/aarch64/sve2p1-1.s
index 5278dcf5e67b4cb34ab45b2b2725ab3af14c2594..b4908b2be38d927bb61a38e5aba681837d8417e1 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.s
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.s
@@ -61,3 +61,32 @@ extq z2.b, z2.b, z5.b[3]
 extq z4.b, z4.b, z12.b[1]
 extq z8.b, z8.b, z7.b[4]
 extq z16.b, z16.b, z1.b[8]
+faddqv v1.8h, p1, z8.h
+faddqv v2.4s, p2, z4.s
+faddqv v4.2d, p3, z2.d
+faddqv v8.2d, p4, z1.d
+faddqv v16.4s, p7, z0.s
+
+fmaxnmqv v1.8h, p1, z8.h
+fmaxnmqv v2.4s, p2, z4.s
+fmaxnmqv v4.2d, p3, z2.d
+fmaxnmqv v8.2d, p4, z1.d
+fmaxnmqv v16.4s, p7, z0.s
+
+fmaxqv v1.8h, p1, z8.h
+fmaxqv v2.4s, p2, z4.s
+fmaxqv v4.2d, p3, z2.d
+fmaxqv v8.2d, p4, z1.d
+fmaxqv v16.4s, p7, z0.s
+
+fminnmqv v1.8h, p1, z8.h
+fminnmqv v2.4s, p2, z4.s
+fminnmqv v4.2d, p3, z2.d
+fminnmqv v8.2d, p4, z1.d
+fminnmqv v16.4s, p7, z0.s
+
+fminqv v1.8h, p1, z8.h
+fminqv v2.4s, p2, z4.s
+fminqv v4.2d, p3, z2.d
+fminqv v8.2d, p4, z1.d
+fminqv v16.4s, p7, z0.s
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 07f4eb319e9be1a8150224b59aba1ab831e51b29..f01ca2abf59a9e6c99f3e742e9db8f46bb1c2a5e 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1922,6 +1922,12 @@
   QLF3(V_4S,NIL,S_S),					\
   QLF3(V_2D,NIL,S_D),					\
 }
+#define OP_SVE_vUS_HSD_HSD				\
+{							\
+  QLF3(V_8H,NIL,S_H),					\
+  QLF3(V_4S,NIL,S_S),					\
+  QLF3(V_2D,NIL,S_D),					\
+}
 #define OP_SVE_VMV_SD                                   \
 {                                                       \
   QLF3(S_S,P_M,S_S),                                    \
@@ -6339,6 +6345,12 @@ const struct aarch64_opcode aarch64_opcode_table[] =
   SVE2p1_INSNC("uminqv",0x040f2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
   SVE2p1_INSNC("eorqv",0x041d2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
 
+  SVE2p1_INSNC("faddqv",0x6410a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("fmaxnmqv",0x6414a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("fmaxqv",0x6416a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("fminnmqv",0x6415a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("fminqv",0x6417a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+
   SVE2p1_INSN("dupq",0x05202400, 0xffe0fc00, sve_index1, 0, OP2 (SVE_Zd, SVE_Zn_5_INDEX), OP_SVE_VV_BHSD, 0, 0),
   SVE2p1_INSN("extq",0x05602400, 0xfff0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zd, SVE_Zm_imm4), OP_SVE_BBB, 0, 0),
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 6/6][Binutils] aarch64: Add SVE2.1 Contiguous load/store instructions.
  2024-01-15  9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
                   ` (2 preceding siblings ...)
  2024-01-15  9:38 ` PATCH 5/6][Binutils] aarch64: Add SVE2.1 fmin and fmax instructions Srinath Parvathaneni
@ 2024-01-15  9:40 ` Srinath Parvathaneni
  2024-01-15 11:46 ` [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Nick Clifton
  4 siblings, 0 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15  9:40 UTC (permalink / raw)
  To: binutils; +Cc: richard.earnshaw, nickc

[-- Attachment #1: Type: text/plain, Size: 223 bytes --]

Hi,

This patch add support for SVE2.1 instructions ld1q,
ld2q, ld3q and ld4q, st1q, st2q, st3q and st4q.

Regression testing for aarch64-none-elf target and found no regressions.

Ok for binutils-master?

Regards,
Srinath.

[-- Attachment #2: 6_6.patch --]
[-- Type: text/x-patch, Size: 12646 bytes --]

diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 0665732fe03cc59df4ebd36ee1afbad08c22b72e..5eff6a754adea9c44432e3faacf31d20c4f6fb98 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -6749,6 +6749,9 @@ parse_operands (char *str, const aarch64_opcode *opcode)
 	case AARCH64_OPND_SVE_ZtxN:
 	case AARCH64_OPND_SME_Zdnx2:
 	case AARCH64_OPND_SME_Zdnx4:
+	case AARCH64_OPND_SME_Zt2:
+	case AARCH64_OPND_SME_Zt3:
+	case AARCH64_OPND_SME_Zt4:
 	case AARCH64_OPND_SME_Zmx2:
 	case AARCH64_OPND_SME_Zmx4:
 	case AARCH64_OPND_SME_Znx2:
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
index 08aef46de61a6cbbe88ebac77da03ee97c9ebe7c..50a4bacc73c20324ae50b8688dd8cf5123a238ae 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
@@ -80,3 +80,17 @@
 .*: Error: selected processor does not support `fminqv v4.2d,p3,z2.d'
 .*: Error: selected processor does not support `fminqv v8.2d,p4,z1.d'
 .*: Error: selected processor does not support `fminqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `ld1q Z0.Q,p4/Z,\[Z16.D,x0\]'
+.*: Error: selected processor does not support `ld2q {Z0.Q,Z1.Q},p4/Z,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `ld3q {Z0.Q,Z1.Q,Z2.Q},p4/Z,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `ld4q {Z0.Q,Z1.Q,Z2.Q,Z3.Q},p4/Z,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `ld2q {Z0.Q,Z1.Q},p4/Z,\[x0,x2,lsl#4\]'
+.*: Error: selected processor does not support `ld3q {Z0.Q,Z1.Q,Z2.Q},p4/Z,\[x0,x4,lsl#4\]'
+.*: Error: selected processor does not support `ld4q {Z0.Q,Z1.Q,Z2.Q,Z3.Q},p4/Z,\[x0,x6,lsl#4\]'
+.*: Error: selected processor does not support `st1q Z0.Q,p4,\[Z16.D,x0\]'
+.*: Error: selected processor does not support `st2q {Z0.Q,Z1.Q},p4,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `st3q {Z0.Q,Z1.Q,Z2.Q},p4,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `st4q {Z0.Q,Z1.Q,Z2.Q,Z3.Q},p4,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `st2q {Z0.Q,Z1.Q},p4,\[x0,x2,lsl#4\]'
+.*: Error: selected processor does not support `st3q {Z0.Q,Z1.Q,Z2.Q},p4,\[x0,x4,lsl#4\]'
+.*: Error: selected processor does not support `st4q {Z0.Q,Z1.Q,Z2.Q,Z3.Q},p4,\[x0,x6,lsl#4\]'
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.d b/gas/testsuite/gas/aarch64/sve2p1-1.d
index 437ce9789834683963910141c1468ad46b273ded..daece899b38bba4daa2ca9e58dba2d551f6cf988 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.d
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.d
@@ -89,3 +89,17 @@
 .*:	64d7ac44 	fminqv	v4.2d, p3, z2.d
 .*:	64d7b028 	fminqv	v8.2d, p4, z1.d
 .*:	6497bc10 	fminqv	v16.4s, p7, z0.s
+.*:	c400b200 	ld1q	z0.q, p4/z, \[z16.d, x0\]
+.*:	a49ef000 	ld2q	{z0.q, z1.q}, p4/z, \[x0, #-4, mul vl\]
+.*:	a51ef000 	ld3q	{z0.q, z1.q, z2.q}, p4/z, \[x0, #-4, mul vl\]
+.*:	a59ef000 	ld4q	{z0.q, z1.q, z2.q, z3.q}, p4/z, \[x0, #-4, mul vl\]
+.*:	a4a2f000 	ld2h	{z0.h-z1.h}, p4/z, \[x0, #4, mul vl\]
+.*:	a5249000 	ld3q	{z0.q, z1.q, z2.q}, p4/z, \[x0, x4, lsl #4\]
+.*:	a5a69000 	ld4q	{z0.q, z1.q, z2.q, z3.q}, p4/z, \[x0, x6, lsl #4\]
+.*:	e4203200 	st1q	z0.q, p4, \[z16.d, x0\]
+.*:	e44e1000 	st2q	{z0.q, z1.q}, p4, \[x0, #-4, mul vl\]
+.*:	e48e1000 	st3q	{z0.q, z1.q, z2.q}, p4, \[x0, #-4, mul vl\]
+.*:	e4ce1000 	st4q	{z0.q, z1.q, z2.q, z3.q}, p4, \[x0, #-4, mul vl\]
+.*:	e4621000 	st2q	{z0.q, z1.q}, p4, \[x0, x2, lsl #4\]
+.*:	e4a41000 	st3q	{z0.q, z1.q, z2.q}, p4, \[x0, x4, lsl #4\]
+.*:	e4e61000 	st4q	{z0.q, z1.q, z2.q, z3.q}, p4, \[x0, x6, lsl #4\]
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.s b/gas/testsuite/gas/aarch64/sve2p1-1.s
index b4908b2be38d927bb61a38e5aba681837d8417e1..2a1c7c107d757ae922cec5566adbace1f03e0dce 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.s
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.s
@@ -90,3 +90,18 @@ fminqv v2.4s, p2, z4.s
 fminqv v4.2d, p3, z2.d
 fminqv v8.2d, p4, z1.d
 fminqv v16.4s, p7, z0.s
+ld1q Z0.Q, p4/Z, [Z16.D, x0]
+ld2q {Z0.Q, Z1.Q}, p4/Z, [x0,  #-4, MUL VL]
+ld3q {Z0.Q, Z1.Q, Z2.Q}, p4/Z, [x0,  #-4, MUL VL]
+ld4q {Z0.Q, Z1.Q, Z2.Q, Z3.Q}, p4/Z, [x0,  #-4, MUL VL]
+ld2q {Z0.Q, Z1.Q}, p4/Z, [x0, x2, lsl  #4]
+ld3q {Z0.Q, Z1.Q, Z2.Q}, p4/Z, [x0, x4, lsl  #4]
+ld4q {Z0.Q, Z1.Q, Z2.Q, Z3.Q}, p4/Z, [x0, x6, lsl  #4]
+
+st1q Z0.Q, p4, [Z16.D, x0]
+st2q {Z0.Q, Z1.Q}, p4, [x0,  #-4, MUL VL]
+st3q {Z0.Q, Z1.Q, Z2.Q}, p4, [x0,  #-4, MUL VL]
+st4q {Z0.Q, Z1.Q, Z2.Q, Z3.Q}, p4, [x0,  #-4, MUL VL]
+st2q {Z0.Q, Z1.Q}, p4, [x0, x2, lsl  #4]
+st3q {Z0.Q, Z1.Q, Z2.Q}, p4, [x0, x4, lsl  #4]
+st4q {Z0.Q, Z1.Q, Z2.Q, Z3.Q}, p4, [x0, x6, lsl  #4]
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index de161db75d509b0ac96c604da7bc9743193d23b2..189bab5a92bcacb1ece30752817f666a34f5d81d 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -797,6 +797,9 @@ enum aarch64_opnd
   AARCH64_OPND_MOPS_WB_Rn,	/* Rn!, in bits [5, 9].  */
   AARCH64_OPND_CSSC_SIMM8,	/* CSSC signed 8-bit immediate.  */
   AARCH64_OPND_CSSC_UIMM8,	/* CSSC unsigned 8-bit immediate.  */
+  AARCH64_OPND_SME_Zt2,		/* Qobule SVE vector register list.  */
+  AARCH64_OPND_SME_Zt3,		/* Trible SVE vector register list.  */
+  AARCH64_OPND_SME_Zt4,		/* Quad SVE vector register list.  */
 };
 
 /* Qualifier constrains an operand.  It either specifies a variant of an
diff --git a/opcodes/aarch64-dis.h b/opcodes/aarch64-dis.h
index 30212f2ae2c2759b5667e5a007912d22c4a702fc..48bebfea1e146e71d5fcae67c6558a35fe198e3f 100644
--- a/opcodes/aarch64-dis.h
+++ b/opcodes/aarch64-dis.h
@@ -139,6 +139,7 @@ AARCH64_DECL_OPD_EXTRACTOR (ext_imm_rotate2);
 AARCH64_DECL_OPD_EXTRACTOR (ext_x0_to_x30);
 AARCH64_DECL_OPD_EXTRACTOR (ext_simple_index);
 AARCH64_DECL_OPD_EXTRACTOR (ext_plain_shrimm);
+AARCH64_DECL_OPD_EXTRACTOR (ext_sve_reglist_zt);
 
 #undef AARCH64_DECL_OPD_EXTRACTOR
 
diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
index 1381e7524402a867cee23becbaa693d1b293c28d..9e96ba35ed45a404426467b897e379ba44e7e51a 100644
--- a/opcodes/aarch64-dis.c
+++ b/opcodes/aarch64-dis.c
@@ -2160,6 +2160,21 @@ aarch64_ext_sve_reglist (const aarch64_operand *self,
   return true;
 }
 
+/* Decode {Zn.<T> , Zm.<T>}.  The fields array specifies which field
+   to use for Zn.  The opcode-dependent value specifies the number
+   of registers in the list.  */
+bool
+aarch64_ext_sve_reglist_zt (const aarch64_operand *self,
+			    aarch64_opnd_info *info, aarch64_insn code,
+			    const aarch64_inst *inst ATTRIBUTE_UNUSED,
+			    aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+  info->reglist.first_regno = extract_field (self->fields[0], code, 0);
+  info->reglist.num_regs = get_operand_specific_data (self);
+  info->reglist.stride = 1;
+  return true;
+}
+
 /* Decode a strided register list.  The first field holds the top bit
    (0 or 16) and the second field holds the lower bits.  The stride is
    16 divided by the list length.  */
diff --git a/opcodes/aarch64-opc.c b/opcodes/aarch64-opc.c
index 1d8ed26c7090e4b73489b15e74a911e33b54555c..13cd2bcd8a7a79508c340bcf618af61b622bc0fe 100644
--- a/opcodes/aarch64-opc.c
+++ b/opcodes/aarch64-opc.c
@@ -1870,6 +1870,9 @@ operand_general_constraint_met_p (const aarch64_opnd_info *opnds, int idx,
 	case AARCH64_OPND_SME_Zmx4:
 	case AARCH64_OPND_SME_Znx2:
 	case AARCH64_OPND_SME_Znx4:
+	case AARCH64_OPND_SME_Zt2:
+	case AARCH64_OPND_SME_Zt3:
+	case AARCH64_OPND_SME_Zt4:
 	  num = get_operand_specific_data (&aarch64_operands[type]);
 	  if (!check_reglist (opnd, mismatch_detail, idx, num, 1))
 	    return 0;
@@ -3626,7 +3629,10 @@ print_register_list (char *buf, size_t size, const aarch64_opnd_info *opnd,
   /* The hyphenated form is preferred for disassembly if there are
      more than two registers in the list, and the register numbers
      are monotonically increasing in increments of one.  */
-  if (stride == 1 && num_regs > 1)
+  if (stride == 1 && num_regs > 1
+      && ((opnd->type != AARCH64_OPND_SME_Zt2)
+	  && (opnd->type != AARCH64_OPND_SME_Zt3)
+	  && (opnd->type != AARCH64_OPND_SME_Zt4)))
     snprintf (buf, size, "{%s-%s}%s",
 	      style_reg (styler, "%s%d.%s", prefix, first_reg, qlf_name),
 	      style_reg (styler, "%s%d.%s", prefix, last_reg, qlf_name), tb);
@@ -4071,6 +4077,9 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
     case AARCH64_OPND_SME_Znx4:
     case AARCH64_OPND_SME_Ztx2_STRIDED:
     case AARCH64_OPND_SME_Ztx4_STRIDED:
+    case AARCH64_OPND_SME_Zt2:
+    case AARCH64_OPND_SME_Zt3:
+    case AARCH64_OPND_SME_Zt4:
       print_register_list (buf, size, opnd, "z", styler);
       break;
 
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 383091ef199310b21a0741527eca50bb4a10e668..c5c5c612e508b29ab99d60e0fae20d2c8fcccde4 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1781,6 +1781,14 @@
 {                                                       \
   QLF3(S_S,P_Z,S_S),                                    \
 }
+#define OP_SVE_SZS_QD                                   \
+{                                                       \
+  QLF3(S_Q,P_Z,S_D),                                    \
+}
+#define OP_SVE_SUS_QD                                   \
+{                                                       \
+  QLF3(S_Q,NIL,S_D),                                    \
+}
 #define OP_SVE_SBB                                      \
 {                                                       \
   QLF3(S_S,S_B,S_B),                                    \
@@ -6353,6 +6361,21 @@ const struct aarch64_opcode aarch64_opcode_table[] =
 
   SVE2p1_INSN("dupq",0x05202400, 0xffe0fc00, sve_index1, 0, OP2 (SVE_Zd, SVE_Zn_5_INDEX), OP_SVE_VV_BHSD, 0, 0),
   SVE2p1_INSN("extq",0x05602400, 0xfff0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zd, SVE_Zm_imm4), OP_SVE_BBB, 0, 0),
+  SVE2p1_INSNC("ld1q",0xc400a000, 0xffe0e000, sve_misc, 0, OP3 (SVE_Zt, SVE_Pg3, SVE_ADDR_ZX), OP_SVE_SZS_QD, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("ld2q",0xa490e000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("ld3q",0xa510e000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("ld4q",0xa590e000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("ld2q",0xa4a0e000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("ld3q",0xa5208000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("ld4q",0xa5a08000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+
+  SVE2p1_INSNC("st1q",0xe4202000, 0xffe0e000, sve_misc, 0, OP3 (SVE_Zt, SVE_Pg3, SVE_ADDR_ZX), OP_SVE_SUS_QD, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("st2q",0xe4400000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("st3q",0xe4800000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("st4q",0xe4c00000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("st2q",0xe4600000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("st3q",0xe4a00000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+  SVE2p1_INSNC("st4q",0xe4e00000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
 
   {0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
 };
@@ -6989,4 +7012,13 @@ const struct aarch64_opcode aarch64_opcode_table[] =
     Y(IMMEDIATE, imm, "CSSC_SIMM8", OPD_F_SEXT, F(FLD_CSSC_imm8),	\
       "an 8-bit signed immediate")					\
     Y(IMMEDIATE, imm, "CSSC_UIMM8", 0, F(FLD_CSSC_imm8),		\
-      "an 8-bit unsigned immediate")
+      "an 8-bit unsigned immediate")					\
+    X(SVE_REGLIST, ins_sve_reglist, ext_sve_reglist_zt, "SME_Zt2",	\
+      2 << OPD_F_OD_LSB, F(FLD_SVE_Zt),					\
+      "a list of 2 SVE vector registers")				\
+    X(SVE_REGLIST, ins_sve_reglist, ext_sve_reglist_zt, "SME_Zt3",	\
+      3 << OPD_F_OD_LSB, F(FLD_SVE_Zt),					\
+      "a list of 3 SVE vector registers")				\
+    X(SVE_REGLIST, ins_sve_reglist, ext_sve_reglist_zt, "SME_Zt4",	\
+      4 << OPD_F_OD_LSB, F(FLD_SVE_Zt),					\
+      "a list of 4 SVE vector registers")

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions.
  2024-01-15  9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
                   ` (3 preceding siblings ...)
  2024-01-15  9:40 ` [PATCH 6/6][Binutils] aarch64: Add SVE2.1 Contiguous load/store instructions Srinath Parvathaneni
@ 2024-01-15 11:46 ` Nick Clifton
  4 siblings, 0 replies; 7+ messages in thread
From: Nick Clifton @ 2024-01-15 11:46 UTC (permalink / raw)
  To: Srinath Parvathaneni, binutils; +Cc: richard.earnshaw

Hi Srinath,

> This patch add support for SVE2.1 and SME2.1 non-widening BFloat16
> (FEAT_B16B16) instructions.
> 
> Following instructions predicated, unpredicated and indexed
> variants are added in this patch.
> 
> bfadd, bfclamp, bfmax bfmaxnm, bfmin,bfminnm,
> bfmla,bfmls,bfmul and bfsub.
> 
> Regression testing for aarch64-none-elf target and found no regressions.
> 
> Ok for binutils-master?

Patch series approved and applied.

Cheers
   Nick



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-01-15 11:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-15  9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
2024-01-15  9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
2024-01-15  9:35   ` [PATCH 3/6][Binutils] aarch64: Add support for FEAT_SVE2p1 Srinath Parvathaneni
2024-01-15  9:37 ` [PATCH 4/6][Binutils] aarch64: Add SVE2.1 dupq, eorqv and extq instructions Srinath Parvathaneni
2024-01-15  9:38 ` PATCH 5/6][Binutils] aarch64: Add SVE2.1 fmin and fmax instructions Srinath Parvathaneni
2024-01-15  9:40 ` [PATCH 6/6][Binutils] aarch64: Add SVE2.1 Contiguous load/store instructions Srinath Parvathaneni
2024-01-15 11:46 ` [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Nick Clifton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).