* [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions.
@ 2024-01-15 9:28 Srinath Parvathaneni
2024-01-15 9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15 9:28 UTC
To: binutils; +Cc: richard.earnshaw, nickc
[-- Attachment #1: Type: text/plain, Size: 388 bytes --]
Hi,
This patch adds support for the SVE2.1 and SME2.1 non-widening BFloat16
(FEAT_B16B16) instructions.
The following instructions are added, in their predicated, unpredicated
and indexed variants where applicable:
bfadd, bfclamp, bfmax, bfmaxnm, bfmin, bfminnm,
bfmla, bfmls, bfmul and bfsub.
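For illustration only, and not part of the patch: the sketch below shows how
a 32-bit encoding can be matched against opcode/mask pairs of the kind this
patch adds to aarch64_opcode_table. The opcode and mask values are copied
from the patch and the sample encodings from bfloat16-1.d; the struct and
all sketch_* names are hypothetical simplifications, not the real binutils
types or lookup code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one opcode table entry.  */
struct sketch_opcode
{
  const char *name;   /* Mnemonic.  */
  const char *form;   /* "predicated" or "unpredicated".  */
  uint32_t opcode;    /* Fixed bits of the encoding.  */
  uint32_t mask;      /* Which bits of the encoding are fixed.  */
};

/* Opcode/mask values taken from the B16B16 entries in the patch.  */
static const struct sketch_opcode sketch_table[] =
{
  {"bfadd",   "predicated",   0x65008000, 0xffffe000},
  {"bfadd",   "unpredicated", 0x65000000, 0xffe0fc00},
  {"bfclamp", "unpredicated", 0x64202400, 0xffe0fc00},
  {"bfmla",   "predicated",   0x65200000, 0xffe0e000},
};

/* Return the first entry whose fixed bits match INSN, or NULL.  */
const struct sketch_opcode *
sketch_lookup (uint32_t insn)
{
  size_t n = sizeof sketch_table / sizeof sketch_table[0];
  for (size_t i = 0; i < n; i++)
    if ((insn & sketch_table[i].mask) == sketch_table[i].opcode)
      return &sketch_table[i];
  return NULL;
}

/* Check the sketch against encodings listed in bfloat16-1.d.  */
void
sketch_self_check (void)
{
  /* 65008200: bfadd z0.h, p0/m, z0.h, z16.h  */
  assert (strcmp (sketch_lookup (0x65008200)->form, "predicated") == 0);
  /* 65100080: bfadd z0.h, z4.h, z16.h  */
  assert (strcmp (sketch_lookup (0x65100080)->form, "unpredicated") == 0);
  /* 64302480: bfclamp z0.h, z4.h, z16.h  */
  assert (strcmp (sketch_lookup (0x64302480)->name, "bfclamp") == 0);
}
```

The predicated and unpredicated forms of bfadd share a mnemonic but use
different masks, which is why each form needs its own table entry.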
Regression tested on the aarch64-none-elf target; no regressions were found.
Ok for binutils-master?
Regards,
Srinath.
[-- Attachment #2: 1_6.patch --]
[-- Type: text/x-patch, Size: 21502 bytes --]
diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 7eb732adbb6c85fdf4db7c4b14d0be5fafa370b6..bc40d126632e093b02268fd7474f4cf0c6ddf6d7 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -10335,6 +10335,7 @@ static const struct aarch64_option_cpu_value_table aarch64_features[] = {
{"ite", AARCH64_FEATURE (ITE), AARCH64_NO_FEATURES},
{"d128", AARCH64_FEATURE (D128),
AARCH64_FEATURE (LSE128)},
+ {"b16b16", AARCH64_FEATURE (B16B16), AARCH64_FEATURE (SVE2)},
{NULL, AARCH64_NO_FEATURES, AARCH64_NO_FEATURES},
};
diff --git a/gas/testsuite/gas/aarch64/bfloat16-1.d b/gas/testsuite/gas/aarch64/bfloat16-1.d
new file mode 100644
index 0000000000000000000000000000000000000000..f0d436bec585ff2aee2e007d63fc672a11a569b9
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/bfloat16-1.d
@@ -0,0 +1,106 @@
+#name: Test of SVE2.1 and SME2.1 non-widening BFloat16 instructions.
+#as: -march=armv9.4-a+b16b16
+#objdump: -dr
+
+[^:]+: file format .*
+
+
+[^:]+:
+
+[^:]+:
+.*: 65008200 bfadd z0.h, p0\/m, z0.h, z16.h
+.*: 65008501 bfadd z1.h, p1\/m, z1.h, z8.h
+.*: 65008882 bfadd z2.h, p2\/m, z2.h, z4.h
+.*: 65009044 bfadd z4.h, p4\/m, z4.h, z2.h
+.*: 65009828 bfadd z8.h, p6\/m, z8.h, z1.h
+.*: 65009c10 bfadd z16.h, p7\/m, z16.h, z0.h
+.*: 65068200 bfmax z0.h, p0\/m, z0.h, z16.h
+.*: 65068501 bfmax z1.h, p1\/m, z1.h, z8.h
+.*: 65068882 bfmax z2.h, p2\/m, z2.h, z4.h
+.*: 65069044 bfmax z4.h, p4\/m, z4.h, z2.h
+.*: 65069828 bfmax z8.h, p6\/m, z8.h, z1.h
+.*: 65069c10 bfmax z16.h, p7\/m, z16.h, z0.h
+.*: 65048200 bfmaxnm z0.h, p0\/m, z0.h, z16.h
+.*: 65048501 bfmaxnm z1.h, p1\/m, z1.h, z8.h
+.*: 65048882 bfmaxnm z2.h, p2\/m, z2.h, z4.h
+.*: 65049044 bfmaxnm z4.h, p4\/m, z4.h, z2.h
+.*: 65049828 bfmaxnm z8.h, p6\/m, z8.h, z1.h
+.*: 65049c10 bfmaxnm z16.h, p7\/m, z16.h, z0.h
+.*: 65078200 bfmin z0.h, p0\/m, z0.h, z16.h
+.*: 65078501 bfmin z1.h, p1\/m, z1.h, z8.h
+.*: 65078882 bfmin z2.h, p2\/m, z2.h, z4.h
+.*: 65079044 bfmin z4.h, p4\/m, z4.h, z2.h
+.*: 65079828 bfmin z8.h, p6\/m, z8.h, z1.h
+.*: 65079c10 bfmin z16.h, p7\/m, z16.h, z0.h
+.*: 65058200 bfminnm z0.h, p0\/m, z0.h, z16.h
+.*: 65058501 bfminnm z1.h, p1\/m, z1.h, z8.h
+.*: 65058882 bfminnm z2.h, p2\/m, z2.h, z4.h
+.*: 65059044 bfminnm z4.h, p4\/m, z4.h, z2.h
+.*: 65059828 bfminnm z8.h, p6\/m, z8.h, z1.h
+.*: 65059c10 bfminnm z16.h, p7\/m, z16.h, z0.h
+.*: 65100080 bfadd z0.h, z4.h, z16.h
+.*: 65080101 bfadd z1.h, z8.h, z8.h
+.*: 65040182 bfadd z2.h, z12.h, z4.h
+.*: 65020204 bfadd z4.h, z16.h, z2.h
+.*: 65010288 bfadd z8.h, z20.h, z1.h
+.*: 65000310 bfadd z16.h, z24.h, z0.h
+.*: 64302480 bfclamp z0.h, z4.h, z16.h
+.*: 64282501 bfclamp z1.h, z8.h, z8.h
+.*: 64242582 bfclamp z2.h, z12.h, z4.h
+.*: 64222604 bfclamp z4.h, z16.h, z2.h
+.*: 64212688 bfclamp z8.h, z20.h, z1.h
+.*: 64202710 bfclamp z16.h, z24.h, z0.h
+.*: 65300000 bfmla z0.h, p0\/m, z0.h, z16.h
+.*: 65280421 bfmla z1.h, p1\/m, z1.h, z8.h
+.*: 65240842 bfmla z2.h, p2\/m, z2.h, z4.h
+.*: 65221084 bfmla z4.h, p4\/m, z4.h, z2.h
+.*: 65211908 bfmla z8.h, p6\/m, z8.h, z1.h
+.*: 65201e10 bfmla z16.h, p7\/m, z16.h, z0.h
+.*: 643e0a00 bfmla z0.h, z16.h, z6.h\[7\]
+.*: 643d0901 bfmla z1.h, z8.h, z5.h\[7\]
+.*: 643409c2 bfmla z2.h, z14.h, z4.h\[5\]
+.*: 642a0aa4 bfmla z4.h, z21.h, z2.h\[3\]
+.*: 64210988 bfmla z8.h, z12.h, z1.h\[1\]
+.*: 64200950 bfmla z16.h, z10.h, z0.h\[1\]
+.*: 65302000 bfmls z0.h, p0\/m, z0.h, z16.h
+.*: 65282421 bfmls z1.h, p1\/m, z1.h, z8.h
+.*: 65242842 bfmls z2.h, p2\/m, z2.h, z4.h
+.*: 65223084 bfmls z4.h, p4\/m, z4.h, z2.h
+.*: 65213908 bfmls z8.h, p6\/m, z8.h, z1.h
+.*: 65203e10 bfmls z16.h, p7\/m, z16.h, z0.h
+.*: 643e0e00 bfmls z0.h, z16.h, z6.h\[7\]
+.*: 643d0d01 bfmls z1.h, z8.h, z5.h\[7\]
+.*: 64340dc2 bfmls z2.h, z14.h, z4.h\[5\]
+.*: 642a0ea4 bfmls z4.h, z21.h, z2.h\[3\]
+.*: 64210d88 bfmls z8.h, z12.h, z1.h\[1\]
+.*: 64200d50 bfmls z16.h, z10.h, z0.h\[1\]
+.*: 65028200 bfmul z0.h, p0\/m, z0.h, z16.h
+.*: 65028501 bfmul z1.h, p1\/m, z1.h, z8.h
+.*: 65028882 bfmul z2.h, p2\/m, z2.h, z4.h
+.*: 65029044 bfmul z4.h, p4\/m, z4.h, z2.h
+.*: 65029828 bfmul z8.h, p6\/m, z8.h, z1.h
+.*: 65029c10 bfmul z16.h, p7\/m, z16.h, z0.h
+.*: 65100880 bfmul z0.h, z4.h, z16.h
+.*: 65080901 bfmul z1.h, z8.h, z8.h
+.*: 65040982 bfmul z2.h, z12.h, z4.h
+.*: 65020a04 bfmul z4.h, z16.h, z2.h
+.*: 65010a88 bfmul z8.h, z20.h, z1.h
+.*: 65000b10 bfmul z16.h, z24.h, z0.h
+.*: 643e2a00 bfmul z0.h, z16.h, z6.h\[7\]
+.*: 643d2901 bfmul z1.h, z8.h, z5.h\[7\]
+.*: 643429c2 bfmul z2.h, z14.h, z4.h\[5\]
+.*: 642a2aa4 bfmul z4.h, z21.h, z2.h\[3\]
+.*: 64212988 bfmul z8.h, z12.h, z1.h\[1\]
+.*: 64202950 bfmul z16.h, z10.h, z0.h\[1\]
+.*: 65018200 bfsub z0.h, p0\/m, z0.h, z16.h
+.*: 65018501 bfsub z1.h, p1\/m, z1.h, z8.h
+.*: 65018882 bfsub z2.h, p2\/m, z2.h, z4.h
+.*: 65019044 bfsub z4.h, p4\/m, z4.h, z2.h
+.*: 65019828 bfsub z8.h, p6\/m, z8.h, z1.h
+.*: 65019c10 bfsub z16.h, p7\/m, z16.h, z0.h
+.*: 65100480 bfsub z0.h, z4.h, z16.h
+.*: 65080501 bfsub z1.h, z8.h, z8.h
+.*: 65040582 bfsub z2.h, z12.h, z4.h
+.*: 65020604 bfsub z4.h, z16.h, z2.h
+.*: 65010688 bfsub z8.h, z20.h, z1.h
+.*: 65000710 bfsub z16.h, z24.h, z0.h
diff --git a/gas/testsuite/gas/aarch64/bfloat16-1.s b/gas/testsuite/gas/aarch64/bfloat16-1.s
new file mode 100644
index 0000000000000000000000000000000000000000..5597d9ef01906f7316149cdf0bb69addeb849926
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/bfloat16-1.s
@@ -0,0 +1,112 @@
+bfadd z0.h, p0/m, z0.h, z16.h
+bfadd z1.h, p1/m, z1.h, z8.h
+bfadd z2.h, p2/m, z2.h, z4.h
+bfadd z4.h, p4/m, z4.h, z2.h
+bfadd z8.h, p6/m, z8.h, z1.h
+bfadd z16.h, p7/m, z16.h, z0.h
+
+bfmax z0.h, p0/m, z0.h, z16.h
+bfmax z1.h, p1/m, z1.h, z8.h
+bfmax z2.h, p2/m, z2.h, z4.h
+bfmax z4.h, p4/m, z4.h, z2.h
+bfmax z8.h, p6/m, z8.h, z1.h
+bfmax z16.h, p7/m, z16.h, z0.h
+
+bfmaxnm z0.h, p0/m, z0.h, z16.h
+bfmaxnm z1.h, p1/m, z1.h, z8.h
+bfmaxnm z2.h, p2/m, z2.h, z4.h
+bfmaxnm z4.h, p4/m, z4.h, z2.h
+bfmaxnm z8.h, p6/m, z8.h, z1.h
+bfmaxnm z16.h, p7/m, z16.h, z0.h
+
+bfmin z0.h, p0/m, z0.h, z16.h
+bfmin z1.h, p1/m, z1.h, z8.h
+bfmin z2.h, p2/m, z2.h, z4.h
+bfmin z4.h, p4/m, z4.h, z2.h
+bfmin z8.h, p6/m, z8.h, z1.h
+bfmin z16.h, p7/m, z16.h, z0.h
+
+bfminnm z0.h, p0/m, z0.h, z16.h
+bfminnm z1.h, p1/m, z1.h, z8.h
+bfminnm z2.h, p2/m, z2.h, z4.h
+bfminnm z4.h, p4/m, z4.h, z2.h
+bfminnm z8.h, p6/m, z8.h, z1.h
+bfminnm z16.h, p7/m, z16.h, z0.h
+
+bfadd z0.h, z4.h, z16.h
+bfadd z1.h, z8.h, z8.h
+bfadd z2.h, z12.h, z4.h
+bfadd z4.h, z16.h, z2.h
+bfadd z8.h, z20.h, z1.h
+bfadd z16.h, z24.h, z0.h
+
+bfclamp z0.h, z4.h, z16.h
+bfclamp z1.h, z8.h, z8.h
+bfclamp z2.h, z12.h, z4.h
+bfclamp z4.h, z16.h, z2.h
+bfclamp z8.h, z20.h, z1.h
+bfclamp z16.h, z24.h, z0.h
+bfmla z0.h, p0/m, z0.h, z16.h
+bfmla z1.h, p1/m, z1.h, z8.h
+bfmla z2.h, p2/m, z2.h, z4.h
+bfmla z4.h, p4/m, z4.h, z2.h
+bfmla z8.h, p6/m, z8.h, z1.h
+bfmla z16.h, p7/m, z16.h, z0.h
+
+bfmla z0.h, z16.h, z6.h[7]
+bfmla z1.h, z8.h, z5.h[6]
+bfmla z2.h, z14.h, z4.h[4]
+bfmla z4.h, z21.h, z2.h[2]
+bfmla z8.h, z12.h, z1.h[1]
+bfmla z16.h, z10.h, z0.h[0]
+
+bfmls z0.h, p0/m, z0.h, z16.h
+bfmls z1.h, p1/m, z1.h, z8.h
+bfmls z2.h, p2/m, z2.h, z4.h
+bfmls z4.h, p4/m, z4.h, z2.h
+bfmls z8.h, p6/m, z8.h, z1.h
+bfmls z16.h, p7/m, z16.h, z0.h
+
+bfmls z0.h, z16.h, z6.h[7]
+bfmls z1.h, z8.h, z5.h[6]
+bfmls z2.h, z14.h, z4.h[4]
+bfmls z4.h, z21.h, z2.h[2]
+bfmls z8.h, z12.h, z1.h[1]
+bfmls z16.h, z10.h, z0.h[0]
+
+bfmul z0.h, p0/m, z0.h, z16.h
+bfmul z1.h, p1/m, z1.h, z8.h
+bfmul z2.h, p2/m, z2.h, z4.h
+bfmul z4.h, p4/m, z4.h, z2.h
+bfmul z8.h, p6/m, z8.h, z1.h
+bfmul z16.h, p7/m, z16.h, z0.h
+
+bfmul z0.h, z4.h, z16.h
+bfmul z1.h, z8.h, z8.h
+bfmul z2.h, z12.h, z4.h
+bfmul z4.h, z16.h, z2.h
+bfmul z8.h, z20.h, z1.h
+bfmul z16.h, z24.h, z0.h
+
+bfmul z0.h, z16.h, z6.h[7]
+bfmul z1.h, z8.h, z5.h[6]
+bfmul z2.h, z14.h, z4.h[4]
+bfmul z4.h, z21.h, z2.h[2]
+bfmul z8.h, z12.h, z1.h[1]
+bfmul z16.h, z10.h, z0.h[0]
+
+bfsub z0.h, p0/m, z0.h, z16.h
+bfsub z1.h, p1/m, z1.h, z8.h
+bfsub z2.h, p2/m, z2.h, z4.h
+bfsub z4.h, p4/m, z4.h, z2.h
+bfsub z8.h, p6/m, z8.h, z1.h
+bfsub z16.h, p7/m, z16.h, z0.h
+
+bfsub z0.h, z4.h, z16.h
+bfsub z1.h, z8.h, z8.h
+bfsub z2.h, z12.h, z4.h
+bfsub z4.h, z16.h, z2.h
+bfsub z8.h, z20.h, z1.h
+bfsub z16.h, z24.h, z0.h
+
+
diff --git a/gas/testsuite/gas/aarch64/bfloat16-bad.d b/gas/testsuite/gas/aarch64/bfloat16-bad.d
new file mode 100644
index 0000000000000000000000000000000000000000..10d2b001c1a39851ab020e20997f2774663dc3ba
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/bfloat16-bad.d
@@ -0,0 +1,4 @@
+#name: Negative test of Bfloat16 instructions.
+#as: -march=armv9.4-a
+#source: bfloat16-1.s
+#error_output: bfloat16-bad.l
diff --git a/gas/testsuite/gas/aarch64/bfloat16-bad.l b/gas/testsuite/gas/aarch64/bfloat16-bad.l
new file mode 100644
index 0000000000000000000000000000000000000000..5a5192b329cd250914c860de5331ef3952ef846b
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/bfloat16-bad.l
@@ -0,0 +1,97 @@
+.*: Assembler messages:
+.*: Error: selected processor does not support `bfadd z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfadd z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfadd z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfadd z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfadd z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfadd z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmax z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmax z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmax z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmax z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmax z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmax z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmaxnm z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmaxnm z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmaxnm z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmaxnm z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmaxnm z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmaxnm z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmin z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmin z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmin z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmin z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmin z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmin z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfminnm z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfminnm z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfminnm z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfminnm z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfminnm z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfminnm z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfadd z0.h,z4.h,z16.h'
+.*: Error: selected processor does not support `bfadd z1.h,z8.h,z8.h'
+.*: Error: selected processor does not support `bfadd z2.h,z12.h,z4.h'
+.*: Error: selected processor does not support `bfadd z4.h,z16.h,z2.h'
+.*: Error: selected processor does not support `bfadd z8.h,z20.h,z1.h'
+.*: Error: selected processor does not support `bfadd z16.h,z24.h,z0.h'
+.*: Error: selected processor does not support `bfclamp z0.h,z4.h,z16.h'
+.*: Error: selected processor does not support `bfclamp z1.h,z8.h,z8.h'
+.*: Error: selected processor does not support `bfclamp z2.h,z12.h,z4.h'
+.*: Error: selected processor does not support `bfclamp z4.h,z16.h,z2.h'
+.*: Error: selected processor does not support `bfclamp z8.h,z20.h,z1.h'
+.*: Error: selected processor does not support `bfclamp z16.h,z24.h,z0.h'
+.*: Error: selected processor does not support `bfmla z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmla z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmla z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmla z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmla z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmla z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmla z0.h,z16.h,z6.h\[7\]'
+.*: Error: selected processor does not support `bfmla z1.h,z8.h,z5.h\[6\]'
+.*: Error: selected processor does not support `bfmla z2.h,z14.h,z4.h\[4\]'
+.*: Error: selected processor does not support `bfmla z4.h,z21.h,z2.h\[2\]'
+.*: Error: selected processor does not support `bfmla z8.h,z12.h,z1.h\[1\]'
+.*: Error: selected processor does not support `bfmla z16.h,z10.h,z0.h\[0\]'
+.*: Error: selected processor does not support `bfmls z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmls z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmls z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmls z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmls z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmls z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmls z0.h,z16.h,z6.h\[7\]'
+.*: Error: selected processor does not support `bfmls z1.h,z8.h,z5.h\[6\]'
+.*: Error: selected processor does not support `bfmls z2.h,z14.h,z4.h\[4\]'
+.*: Error: selected processor does not support `bfmls z4.h,z21.h,z2.h\[2\]'
+.*: Error: selected processor does not support `bfmls z8.h,z12.h,z1.h\[1\]'
+.*: Error: selected processor does not support `bfmls z16.h,z10.h,z0.h\[0\]'
+.*: Error: selected processor does not support `bfmul z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfmul z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfmul z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfmul z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfmul z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfmul z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfmul z0.h,z4.h,z16.h'
+.*: Error: selected processor does not support `bfmul z1.h,z8.h,z8.h'
+.*: Error: selected processor does not support `bfmul z2.h,z12.h,z4.h'
+.*: Error: selected processor does not support `bfmul z4.h,z16.h,z2.h'
+.*: Error: selected processor does not support `bfmul z8.h,z20.h,z1.h'
+.*: Error: selected processor does not support `bfmul z16.h,z24.h,z0.h'
+.*: Error: selected processor does not support `bfmul z0.h,z16.h,z6.h\[7\]'
+.*: Error: selected processor does not support `bfmul z1.h,z8.h,z5.h\[6\]'
+.*: Error: selected processor does not support `bfmul z2.h,z14.h,z4.h\[4\]'
+.*: Error: selected processor does not support `bfmul z4.h,z21.h,z2.h\[2\]'
+.*: Error: selected processor does not support `bfmul z8.h,z12.h,z1.h\[1\]'
+.*: Error: selected processor does not support `bfmul z16.h,z10.h,z0.h\[0\]'
+.*: Error: selected processor does not support `bfsub z0.h,p0\/m,z0.h,z16.h'
+.*: Error: selected processor does not support `bfsub z1.h,p1\/m,z1.h,z8.h'
+.*: Error: selected processor does not support `bfsub z2.h,p2\/m,z2.h,z4.h'
+.*: Error: selected processor does not support `bfsub z4.h,p4\/m,z4.h,z2.h'
+.*: Error: selected processor does not support `bfsub z8.h,p6\/m,z8.h,z1.h'
+.*: Error: selected processor does not support `bfsub z16.h,p7\/m,z16.h,z0.h'
+.*: Error: selected processor does not support `bfsub z0.h,z4.h,z16.h'
+.*: Error: selected processor does not support `bfsub z1.h,z8.h,z8.h'
+.*: Error: selected processor does not support `bfsub z2.h,z12.h,z4.h'
+.*: Error: selected processor does not support `bfsub z4.h,z16.h,z2.h'
+.*: Error: selected processor does not support `bfsub z8.h,z20.h,z1.h'
+.*: Error: selected processor does not support `bfsub z16.h,z24.h,z0.h'
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index 9d64d7a0ebefa4014f30a46c5be7bda124666327..e2ca92361b46a27f67d315d155eb3a9608176cb7 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -222,6 +222,8 @@ enum aarch64_feature_bit {
AARCH64_FEATURE_PMUv3_ICNTR,
/* Performance Monitors Synchronous-Exception-Based Event Extension. */
AARCH64_FEATURE_SEBEP,
+ /* SVE2.1 and SME2.1 non-widening BFloat16 instructions. */
+ AARCH64_FEATURE_B16B16,
AARCH64_NUM_FEATURES
};
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 0cf195d03216a38e1a9b5e06b80af064e2440b91..a8ccdafd044efd62d11ba1e4c199792f6dd44559 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1761,6 +1761,10 @@
{ \
QLF3(S_S,NIL,S_S), \
}
+#define OP_SVE_SMSS \
+{ \
+ QLF4(S_H,P_M,S_H,S_H), \
+}
#define OP_SVE_SUU \
{ \
QLF3(S_S,NIL,NIL), \
@@ -2608,6 +2612,8 @@ static const aarch64_feature_set aarch64_feature_the =
AARCH64_FEATURE (THE);
static const aarch64_feature_set aarch64_feature_d128_the =
AARCH64_FEATURES (2, D128, THE);
+static const aarch64_feature_set aarch64_feature_b16b16 =
+ AARCH64_FEATURE (B16B16);
#define CORE &aarch64_feature_v8
#define FP &aarch64_feature_fp
@@ -2670,6 +2676,7 @@ static const aarch64_feature_set aarch64_feature_d128_the =
#define D128 &aarch64_feature_d128
#define THE &aarch64_feature_the
#define D128_THE &aarch64_feature_d128_the
+#define B16B16 &aarch64_feature_b16b16
#define CORE_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS) \
{ NAME, OPCODE, MASK, CLASS, OP, CORE, OPS, QUALS, FLAGS, 0, 0, NULL }
@@ -2739,6 +2746,12 @@ static const aarch64_feature_set aarch64_feature_d128_the =
#define SVE2_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
{ NAME, OPCODE, MASK, CLASS, OP, SVE2, OPS, QUALS, \
FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
+#define B16B16_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
+ { NAME, OPCODE, MASK, CLASS, OP, B16B16, OPS, QUALS, \
+ FLAGS | F_STRICT, 0, TIED, NULL }
+#define B16B16_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
+ { NAME, OPCODE, MASK, CLASS, OP, B16B16, OPS, QUALS, \
+ FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
#define SVE2AES_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
{ NAME, OPCODE, MASK, CLASS, OP, SVE2_AES, OPS, QUALS, \
FLAGS | F_STRICT, 0, TIED, NULL }
@@ -6258,6 +6271,24 @@ const struct aarch64_opcode aarch64_opcode_table[] =
D128_THE_INSN("rcwsswppal", 0x59e0a000, 0xffe0fc00, OP3 (Rt, Rs, ADDR_SIMPLE), QL_X2NIL, 0),
D128_THE_INSN("rcwsswppl", 0x5960a000, 0xffe0fc00, OP3 (Rt, Rs, ADDR_SIMPLE), QL_X2NIL, 0),
+/* BFloat16 SVE Instructions. */
+ B16B16_INSNC("bfadd", 0x65008000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSNC("bfmax", 0x65068000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSNC("bfmaxnm", 0x65048000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSNC("bfmin", 0x65078000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSNC("bfminnm", 0x65058000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSNC("bfmla", 0x65200000, 0xffe0e000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zn, SVE_Zm_16), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSNC("bfmls", 0x65202000, 0xffe0e000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zn, SVE_Zm_16), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSN("bfadd", 0x65000000, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_HHH, 0, 0),
+ B16B16_INSN("bfclamp", 0x64202400, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_HHH, 0, 0),
+ B16B16_INSNC("bfmul", 0x65028000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSN("bfmul", 0x65000800, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_HHH, 0, 0),
+ B16B16_INSNC("bfsub", 0x65018000, 0xffffe000, sve_misc, 0, OP4 (SVE_Zd, SVE_Pg3, SVE_Zd, SVE_Zm_5), OP_SVE_SMSS, 0, C_SCAN_MOVPRFX, 0),
+ B16B16_INSN("bfsub", 0x65000400, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_HHH, 0, 0),
+ B16B16_INSN("bfmla", 0x64200800, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
+ B16B16_INSN("bfmls", 0x64200c00, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
+ B16B16_INSN("bfmul", 0x64202800, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
+
{0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
};
* [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions.
2024-01-15 9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
@ 2024-01-15 9:34 ` Srinath Parvathaneni
2024-01-15 9:35 ` [PATCH 3/6][Binutils] aarch64: Add support for FEAT_SVE2p1 Srinath Parvathaneni
2024-01-15 9:37 ` [PATCH 4/6][Binutils] aarch64: Add SVE2.1 dupq, eorqv and extq instructions Srinath Parvathaneni
` (3 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15 9:34 UTC
To: binutils; +Cc: richard.earnshaw, nickc
[-- Attachment #1: Type: text/plain, Size: 374 bytes --]
Hi,
This patch adds support for FEAT_SME2p1 and the "movaz" instructions,
along with the optional flag +sme2p1.
The following "movaz" instructions are added:
Move and zero two ZA tile slices to vector registers.
Move and zero four ZA tile slices to vector registers.
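For illustration only, and not part of the patch: the sketch below mirrors
the two small computations that the new inserters aarch64_ins_sme_za_vrs1
and aarch64_ins_sme_za_vrs2 apply to a slice operand such as [w12, 14:15].
The sketch_* helper names are hypothetical, and the sketch assumes that the
index register number is the W register number (12 to 15). It computes
field values only and makes no claim about where those fields sit in the
encoding.

```c
#include <assert.h>

/* Hypothetical helper: value stored in the offset field for a slice
   range imm:(imm + countm1).  Mirrors `imm / (countm1 + 1)' in the
   patch; countm1 is 1 for two registers and 3 for four.  */
int
sketch_offset_field (int imm, int countm1)
{
  assert (countm1 >= 0);
  return imm / (countm1 + 1);
}

/* Hypothetical helper: 2-bit value for the slice select register.
   Mirrors `regno & 3' in the patch, assuming regno is 12..15 for
   w12..w15.  */
int
sketch_select_field (int wreg)
{
  assert (wreg >= 12 && wreg <= 15);
  return wreg & 3;
}
```

Under these assumptions, [w12, 14:15] in the two-register form gives an
offset field of 7 and a select field of 0, while [w14, 8:11] in the
four-register form gives an offset field of 2 and a select field of 2.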
Regression tested on the aarch64-none-elf target; no regressions were found.
Ok for binutils-master?
Regards,
Srinath.
[-- Attachment #2: 2_6.patch --]
[-- Type: text/x-patch, Size: 24758 bytes --]
diff --git a/gas/NEWS b/gas/NEWS
index 74df7e61349626926bec1626aa5d89629c5d6c4a..d2c5c0641c4392d66472e535e5f51756fc5d511f 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,8 @@
-*- text -*-
+* Add support for the AArch64 Scalable Matrix Extension version 2.1 (SME2.1)
+ instructions.
+
* Add support for 'armv8.9-a' and 'armv9.4-a' for -march in Arm GAS.
* Initial support for Intel APX: 32 GPRs, NDD, PUSH2/POP2 and PUSHP/POPP.
diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index bc40d126632e093b02268fd7474f4cf0c6ddf6d7..34159c2168b78fe12d4a549678ae77be88e50313 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -4492,6 +4492,7 @@ parse_sme_immediate (char **str, int64_t *imm)
[<Wv>, <imm>]
[<Wv>, #<imm>]
+ [<Ws>, <offsf>:<offsl>]
Return true on success, populating OPND with the parsed index. */
@@ -4592,6 +4593,7 @@ parse_sme_za_index (char **str, struct aarch64_indexed_za *opnd)
<Pm>.<T>[<Wv>< #<imm>]
ZA[<Wv>, #<imm>]
<ZAn><HV>.<T>[<Wv>, #<imm>]
+ <ZAn><HV>.<T>[<Ws>, <offsf>:<offsl>]
FLAGS is as for parse_typed_reg. */
@@ -7865,6 +7867,21 @@ parse_operands (char *str, const aarch64_opcode *opcode)
info->qualifier = qualifier;
break;
+ case AARCH64_OPND_SME_ZA_array_vrsb_1:
+ case AARCH64_OPND_SME_ZA_array_vrsh_1:
+ case AARCH64_OPND_SME_ZA_array_vrss_1:
+ case AARCH64_OPND_SME_ZA_array_vrsd_1:
+ case AARCH64_OPND_SME_ZA_array_vrsb_2:
+ case AARCH64_OPND_SME_ZA_array_vrsh_2:
+ case AARCH64_OPND_SME_ZA_array_vrss_2:
+ case AARCH64_OPND_SME_ZA_array_vrsd_2:
+ if (!parse_dual_indexed_reg (&str, REG_TYPE_ZATHV,
+ &info->indexed_za, &qualifier, 0))
+ goto failure;
+ info->qualifier = qualifier;
+ break;
+
+
case AARCH64_OPND_SME_VLxN_10:
case AARCH64_OPND_SME_VLxN_13:
po_strict_enum_or_fail (aarch64_sme_vlxn_array);
@@ -10336,6 +10353,7 @@ static const struct aarch64_option_cpu_value_table aarch64_features[] = {
{"d128", AARCH64_FEATURE (D128),
AARCH64_FEATURE (LSE128)},
{"b16b16", AARCH64_FEATURE (B16B16), AARCH64_FEATURE (SVE2)},
+ {"sme2p1", AARCH64_FEATURE (SME2p1), AARCH64_FEATURE (SME2)},
{NULL, AARCH64_NO_FEATURES, AARCH64_NO_FEATURES},
};
diff --git a/gas/doc/c-aarch64.texi b/gas/doc/c-aarch64.texi
index ccf18ee2661fed0bb89608d7f122556cb541e5b5..1f3a4fbcafbab652d70f202b33527ae3dde7e1bb 100644
--- a/gas/doc/c-aarch64.texi
+++ b/gas/doc/c-aarch64.texi
@@ -276,6 +276,8 @@ automatically cause those extensions to be disabled.
@tab Enable TRCIT instruction.
@item @code{d128} @tab Armv9.4-A @tab No
@tab Enable the 128-bit Page Descriptor Extension. This implies @code{lse128}.
+@item @code{sme2p1} @tab N/A @tab No
+ @tab Enable the SME2.1 Extension.
@end multitable
@node AArch64 Syntax
diff --git a/gas/testsuite/gas/aarch64/sme2p1-1.d b/gas/testsuite/gas/aarch64/sme2p1-1.d
new file mode 100644
index 0000000000000000000000000000000000000000..a6e7b7664024e7f03ddd1d8ece9d6c3bd1c79042
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sme2p1-1.d
@@ -0,0 +1,42 @@
+#name: Test of SME2.1 movaz instructions.
+#as: -march=armv9.4-a+sme2p1
+#objdump: -dr
+
+[^:]+: file format .*
+
+
+[^:]+:
+
+[^:]+:
+.*: c006c260 movaz {z0.b-z1.b}, za0v.b \[w14, 6:7\]
+.*: c046c260 movaz {z0.h-z1.h}, za0v.h \[w14, 6:7\]
+.*: c086c220 movaz {z0.s-z1.s}, za0v.s \[w14, 2:3\]
+.*: c0c6c200 movaz {z0.d-z1.d}, za0v.d \[w14, 0:1\]
+.*: c00602e0 movaz {z0.b-z1.b}, za0h.b \[w12, 14:15\]
+.*: c0462260 movaz {z0.h-z1.h}, za0h.h \[w13, 6:7\]
+.*: c0864220 movaz {z0.s-z1.s}, za0h.s \[w14, 2:3\]
+.*: c0c66200 movaz {z0.d-z1.d}, za0h.d \[w15, 0:1\]
+.*: c006c260 movaz {z0.b-z1.b}, za0v.b \[w14, 6:7\]
+.*: c046c2e0 movaz {z0.h-z1.h}, za1v.h \[w14, 6:7\]
+.*: c086c2a0 movaz {z0.s-z1.s}, za2v.s \[w14, 2:3\]
+.*: c0c6c260 movaz {z0.d-z1.d}, za3v.d \[w14, 0:1\]
+.*: c00602e0 movaz {z0.b-z1.b}, za0h.b \[w12, 14:15\]
+.*: c04622e0 movaz {z0.h-z1.h}, za1h.h \[w13, 6:7\]
+.*: c08642a0 movaz {z0.s-z1.s}, za2h.s \[w14, 2:3\]
+.*: c0c66260 movaz {z0.d-z1.d}, za3h.d \[w15, 0:1\]
+.*: c006c660 movaz {z0.b-z3.b}, za0v.b \[w14, 12:15\]
+.*: c046c620 movaz {z0.h-z3.h}, za0v.h \[w14, 4:7\]
+.*: c086c600 movaz {z0.s-z3.s}, za0v.s \[w14, 0:3\]
+.*: c0c6c600 movaz {z0.d-z3.d}, za0v.d \[w14, 0:3\]
+.*: c0060660 movaz {z0.b-z3.b}, za0h.b \[w12, 12:15\]
+.*: c0462620 movaz {z0.h-z3.h}, za0h.h \[w13, 4:7\]
+.*: c0864600 movaz {z0.s-z3.s}, za0h.s \[w14, 0:3\]
+.*: c0c66600 movaz {z0.d-z3.d}, za0h.d \[w15, 0:3\]
+.*: c006c640 movaz {z0.b-z3.b}, za0v.b \[w14, 8:11\]
+.*: c046c660 movaz {z0.h-z3.h}, za1v.h \[w14, 4:7\]
+.*: c086c640 movaz {z0.s-z3.s}, za2v.s \[w14, 0:3\]
+.*: c0c6c660 movaz {z0.d-z3.d}, za3v.d \[w14, 0:3\]
+.*: c0060660 movaz {z0.b-z3.b}, za0h.b \[w12, 12:15\]
+.*: c0462660 movaz {z0.h-z3.h}, za1h.h \[w13, 4:7\]
+.*: c0864640 movaz {z0.s-z3.s}, za2h.s \[w14, 0:3\]
+.*: c0c66660 movaz {z0.d-z3.d}, za3h.d \[w15, 0:3\]
diff --git a/gas/testsuite/gas/aarch64/sme2p1-1.s b/gas/testsuite/gas/aarch64/sme2p1-1.s
new file mode 100644
index 0000000000000000000000000000000000000000..77481d4b874b4688e10c794e6ea9e1ff0c81ef3d
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sme2p1-1.s
@@ -0,0 +1,39 @@
+ movaz {z0.b - z1.b}, ZA0V.B [w14, 6:7]
+ movaz {z0.h - z1.h}, ZA0V.H [w14, 6:7]
+ movaz {z0.s - z1.s}, ZA0V.S [w14, 2:3]
+ movaz {z0.d - z1.d}, ZA0V.D [w14, 0:1]
+
+ movaz {z0.b - z1.b}, ZA0H.B [w12, 14:15]
+ movaz {z0.h - z1.h}, ZA0H.H [w13, 6:7]
+ movaz {z0.s - z1.s}, ZA0H.S [w14, 2:3]
+ movaz {z0.d - z1.d}, ZA0H.D [w15, 0:1]
+
+ movaz {z0.b - z1.b}, ZA0V.B [w14, 6:7]
+ movaz {z0.h - z1.h}, ZA1V.H [w14, 6:7]
+ movaz {z0.s - z1.s}, ZA2V.S [w14, 2:3]
+ movaz {z0.d - z1.d}, ZA3V.D [w14, 0:1]
+
+ movaz {z0.b - z1.b}, ZA0H.B [w12, 14:15]
+ movaz {z0.h - z1.h}, ZA1H.H [w13, 6:7]
+ movaz {z0.s - z1.s}, ZA2H.S [w14, 2:3]
+ movaz {z0.d - z1.d}, ZA3H.D [w15, 0:1]
+
+ movaz {z0.b - z3.b}, ZA0V.B [w14, 12:15]
+ movaz {z0.h - z3.h}, ZA0V.H [w14, 4:7]
+ movaz {z0.s - z3.s}, ZA0V.S [w14, 0:3]
+ movaz {z0.d - z3.d}, ZA0V.D [w14, 0:3]
+
+ movaz {z0.b - z3.b}, ZA0H.B [w12, 12:15]
+ movaz {z0.h - z3.h}, ZA0H.H [w13, 4:7]
+ movaz {z0.s - z3.s}, ZA0H.S [w14, 0:3]
+ movaz {z0.d - z3.d}, ZA0H.D [w15, 0:3]
+
+ movaz {z0.b - z3.b}, ZA0V.B [w14, 8:11]
+ movaz {z0.h - z3.h}, ZA1V.H [w14, 4:7]
+ movaz {z0.s - z3.s}, ZA2V.S [w14, 0:3]
+ movaz {z0.d - z3.d}, ZA3V.D [w14, 0:3]
+
+ movaz {z0.b - z3.b}, ZA0H.B [w12, 12:15]
+ movaz {z0.h - z3.h}, ZA1H.H [w13, 4:7]
+ movaz {z0.s - z3.s}, ZA2H.S [w14, 0:3]
+ movaz {z0.d - z3.d}, ZA3H.D [w15, 0:3]
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index e2ca92361b46a27f67d315d155eb3a9608176cb7..648e25f3e4242bb738eee5f62079838784223b8a 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -224,6 +224,8 @@ enum aarch64_feature_bit {
AARCH64_FEATURE_SEBEP,
/* SVE2.1 and SME2.1 non-widening BFloat16 instructions. */
AARCH64_FEATURE_B16B16,
+ /* SME2.1 instructions. */
+ AARCH64_FEATURE_SME2p1,
AARCH64_NUM_FEATURES
};
@@ -705,6 +707,14 @@ enum aarch64_opnd
AARCH64_OPND_SVE_Vd, /* Scalar SIMD&FP register in Vd. */
AARCH64_OPND_SVE_Vm, /* Scalar SIMD&FP register in Vm. */
AARCH64_OPND_SVE_Vn, /* Scalar SIMD&FP register in Vn. */
+ AARCH64_OPND_SME_ZA_array_vrsb_1, /* Tile to vector, two registers (B). */
+ AARCH64_OPND_SME_ZA_array_vrsh_1, /* Tile to vector, two registers (H). */
+ AARCH64_OPND_SME_ZA_array_vrss_1, /* Tile to vector, two registers (S). */
+ AARCH64_OPND_SME_ZA_array_vrsd_1, /* Tile to vector, two registers (D). */
+ AARCH64_OPND_SME_ZA_array_vrsb_2, /* Tile to vector, four registers (B). */
+ AARCH64_OPND_SME_ZA_array_vrsh_2, /* Tile to vector, four registers (H). */
+ AARCH64_OPND_SME_ZA_array_vrss_2, /* Tile to vector, four registers (S). */
+ AARCH64_OPND_SME_ZA_array_vrsd_2, /* Tile to vector, four registers (D). */
AARCH64_OPND_SVE_Za_5, /* SVE vector register in Za, bits [9,5]. */
AARCH64_OPND_SVE_Za_16, /* SVE vector register in Za, bits [20,16]. */
AARCH64_OPND_SVE_Zd, /* SVE vector register in Zd. */
@@ -962,6 +972,7 @@ enum aarch64_insn_class
sme_start,
sme_stop,
sme2_mov,
+ sme2_movaz,
sve_cpy,
sve_index,
sve_limm,
diff --git a/opcodes/aarch64-asm.h b/opcodes/aarch64-asm.h
index a3bf7bda0130f06623b0b56c18d9eca632b24a3a..d4b6407dc5de8d6e103ee8ca5b5f2c6bb814647f 100644
--- a/opcodes/aarch64-asm.h
+++ b/opcodes/aarch64-asm.h
@@ -100,6 +100,8 @@ AARCH64_DECL_OPD_INSERTER (ins_sve_strided_reglist);
AARCH64_DECL_OPD_INSERTER (ins_sve_scale);
AARCH64_DECL_OPD_INSERTER (ins_sve_shlimm);
AARCH64_DECL_OPD_INSERTER (ins_sve_shrimm);
+AARCH64_DECL_OPD_INSERTER (ins_sme_za_vrs1);
+AARCH64_DECL_OPD_INSERTER (ins_sme_za_vrs2);
AARCH64_DECL_OPD_INSERTER (ins_sme_za_hv_tiles);
AARCH64_DECL_OPD_INSERTER (ins_sme_za_hv_tiles_range);
AARCH64_DECL_OPD_INSERTER (ins_sme_za_list);
diff --git a/opcodes/aarch64-asm.c b/opcodes/aarch64-asm.c
index 1db290eea7e9d23893bfd2bd0a07c13392f6c9f6..3fac127a5899077e2ac19c5e98df737b8ffbe147 100644
--- a/opcodes/aarch64-asm.c
+++ b/opcodes/aarch64-asm.c
@@ -1375,6 +1375,76 @@ aarch64_ins_sve_float_zero_one (const aarch64_operand *self,
return true;
}
+bool
+aarch64_ins_sme_za_vrs1 (const aarch64_operand *self,
+ const aarch64_opnd_info *info,
+ aarch64_insn *code,
+ const aarch64_inst *inst ATTRIBUTE_UNUSED,
+ aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+ int za_reg = info->indexed_za.regno;
+ int regno = info->indexed_za.index.regno & 3;
+ int imm = info->indexed_za.index.imm;
+ int v = info->indexed_za.v;
+ int countm1 = info->indexed_za.index.countm1;
+
+ insert_field (self->fields[0], code, v, 0);
+ insert_field (self->fields[1], code, regno, 0);
+ switch (info->qualifier)
+ {
+ case AARCH64_OPND_QLF_S_B:
+ insert_field (self->fields[2], code, imm / (countm1 + 1), 0);
+ break;
+ case AARCH64_OPND_QLF_S_H:
+ case AARCH64_OPND_QLF_S_S:
+ insert_field (self->fields[2], code, za_reg, 0);
+ insert_field (self->fields[3], code, imm / (countm1 + 1), 0);
+ break;
+ case AARCH64_OPND_QLF_S_D:
+ insert_field (self->fields[2], code, za_reg, 0);
+ break;
+ default:
+ return false;
+ }
+
+ return true;
+}
+
+bool
+aarch64_ins_sme_za_vrs2 (const aarch64_operand *self,
+ const aarch64_opnd_info *info,
+ aarch64_insn *code,
+ const aarch64_inst *inst ATTRIBUTE_UNUSED,
+ aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+ int za_reg = info->indexed_za.regno;
+ int regno = info->indexed_za.index.regno & 3;
+ int imm = info->indexed_za.index.imm;
+ int v = info->indexed_za.v;
+ int countm1 = info->indexed_za.index.countm1;
+
+ insert_field (self->fields[0], code, v, 0);
+ insert_field (self->fields[1], code, regno, 0);
+ switch (info->qualifier)
+ {
+ case AARCH64_OPND_QLF_S_B:
+ insert_field (self->fields[2], code, imm / (countm1 + 1), 0);
+ break;
+ case AARCH64_OPND_QLF_S_H:
+ insert_field (self->fields[2], code, za_reg, 0);
+ insert_field (self->fields[3], code, imm / (countm1 + 1), 0);
+ break;
+ case AARCH64_OPND_QLF_S_S:
+ case AARCH64_OPND_QLF_S_D:
+ insert_field (self->fields[2], code, za_reg, 0);
+ break;
+ default:
+ return false;
+ }
+
+ return true;
+}
+
/* Encode in SME instruction such as MOVA ZA tile vector register number,
vector indicator, vector selector and immediate. */
bool
@@ -2011,6 +2081,7 @@ aarch64_encode_variant_using_iclass (struct aarch64_inst *inst)
break;
case sme_misc:
+ case sme2_movaz:
case sve_misc:
/* These instructions have only a single variant. */
break;
diff --git a/opcodes/aarch64-dis.h b/opcodes/aarch64-dis.h
index 20387db7b39e98c081ff88c8c73b437f0b50fd01..9a38c1ab50f7fdb27588c7451ade19c166e69c96 100644
--- a/opcodes/aarch64-dis.h
+++ b/opcodes/aarch64-dis.h
@@ -124,6 +124,8 @@ AARCH64_DECL_OPD_EXTRACTOR (ext_sve_strided_reglist);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_scale);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_shlimm);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_shrimm);
+AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_vrs1);
+AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_vrs2);
AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_hv_tiles);
AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_hv_tiles_range);
AARCH64_DECL_OPD_EXTRACTOR (ext_sme_za_list);
diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
index 7e088a93c107b6152b2df1bb622516d09fce3839..a14b2ca02d2c0fcf6c94f5bb0c587a7168594a5b 100644
--- a/opcodes/aarch64-dis.c
+++ b/opcodes/aarch64-dis.c
@@ -1929,6 +1929,84 @@ aarch64_ext_sme_za_array (const aarch64_operand *self,
return true;
}
+/* Decode two ZA tile slice (V, Rv, off3 | ZAn, off2 | ZAn, ol | ZAn) fields. */
+bool
+aarch64_ext_sme_za_vrs1 (const aarch64_operand *self,
+ aarch64_opnd_info *info, aarch64_insn code,
+ const aarch64_inst *inst,
+ aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+ int v = extract_field (self->fields[0], code, 0);
+ int regno = 12 + extract_field (self->fields[1], code, 0);
+ int imm, za_reg, num_offset = 2;
+
+ switch (info->qualifier)
+ {
+ case AARCH64_OPND_QLF_S_B:
+ imm = extract_field (self->fields[2], code, 0);
+ info->indexed_za.index.imm = imm * num_offset;
+ break;
+ case AARCH64_OPND_QLF_S_H:
+ case AARCH64_OPND_QLF_S_S:
+ za_reg = extract_field (self->fields[2], code, 0);
+ imm = extract_field (self->fields[3], code, 0);
+ info->indexed_za.index.imm = imm * num_offset;
+ info->indexed_za.regno = za_reg;
+ break;
+ case AARCH64_OPND_QLF_S_D:
+ za_reg = extract_field (self->fields[2], code, 0);
+ info->indexed_za.regno = za_reg;
+ break;
+ default:
+ return false;
+ }
+
+ info->indexed_za.index.regno = regno;
+ info->indexed_za.index.countm1 = num_offset - 1;
+ info->indexed_za.v = v;
+ info->indexed_za.group_size = get_opcode_dependent_value (inst->opcode);
+ return true;
+}
+
+/* Decode four ZA tile slice (V, Rv, off3 | ZAn, off2 | ZAn, ol | ZAn) fields. */
+bool
+aarch64_ext_sme_za_vrs2 (const aarch64_operand *self,
+ aarch64_opnd_info *info, aarch64_insn code,
+ const aarch64_inst *inst,
+ aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+ int v = extract_field (self->fields[0], code, 0);
+ int regno = 12 + extract_field (self->fields[1], code, 0);
+ int imm, za_reg, num_offset = 4;
+
+ switch (info->qualifier)
+ {
+ case AARCH64_OPND_QLF_S_B:
+ imm = extract_field (self->fields[2], code, 0);
+ info->indexed_za.index.imm = imm * num_offset;
+ break;
+ case AARCH64_OPND_QLF_S_H:
+ za_reg = extract_field (self->fields[2], code, 0);
+ imm = extract_field (self->fields[3], code, 0);
+ info->indexed_za.index.imm = imm * num_offset;
+ info->indexed_za.regno = za_reg;
+ break;
+ case AARCH64_OPND_QLF_S_S:
+ case AARCH64_OPND_QLF_S_D:
+ za_reg = extract_field (self->fields[2], code, 0);
+ info->indexed_za.regno = za_reg;
+ break;
+ default:
+ return false;
+ }
+
+ info->indexed_za.index.regno = regno;
+ info->indexed_za.index.countm1 = num_offset - 1;
+ info->indexed_za.v = v;
+ info->indexed_za.group_size = get_opcode_dependent_value (inst->opcode);
+ return true;
+}
+
bool
aarch64_ext_sme_addr_ri_u4xvl (const aarch64_operand *self,
aarch64_opnd_info *info, aarch64_insn code,
@@ -3160,6 +3238,7 @@ aarch64_decode_variant_using_iclass (aarch64_inst *inst)
variant = 3;
break;
+ case sme2_movaz:
case sme_misc:
case sve_misc:
/* These instructions have only a single variant. */
diff --git a/opcodes/aarch64-opc.h b/opcodes/aarch64-opc.h
index f193a90ecc5993645fed573a9a0670aeb4fbf781..587775152e3ef26cb3d09e138eda2791b95cb5d9 100644
--- a/opcodes/aarch64-opc.h
+++ b/opcodes/aarch64-opc.h
@@ -210,6 +210,13 @@ enum aarch64_field_kind
FLD_sz,
FLD_type,
FLD_vldst_size,
+ FLD_off3,
+ FLD_off2,
+ FLD_ZAn_1,
+ FLD_ol,
+ FLD_ZAn_2,
+ FLD_ZAn_3,
+ FLD_ZAn
};
/* Field description. */
diff --git a/opcodes/aarch64-opc.c b/opcodes/aarch64-opc.c
index e3ad32f5a1e070fe1cc464e1c0df2b0f4347f45f..cf76871930f9f4e8613a977efb81464dce3d8ba7 100644
--- a/opcodes/aarch64-opc.c
+++ b/opcodes/aarch64-opc.c
@@ -400,6 +400,16 @@ const aarch64_field fields[] =
{ 22, 1 }, /* sz: 1-bit element size select. */
{ 22, 2 }, /* type: floating point type field in fp data inst. */
{ 10, 2 }, /* vldst_size: size field in the AdvSIMD load/store inst. */
+ { 5, 3 }, /* off3: immediate offset used to calculate slice number in a
+ ZA tile. */
+ { 5, 2 }, /* off2: immediate offset used to calculate slice number in
+ a ZA tile. */
+ { 7, 1 }, /* ZAn_1: name of the 1-bit encoded ZA tile. */
+ { 5, 1 }, /* ol: immediate offset used to calculate slice number in a ZA
+ tile. */
+ { 6, 2 }, /* ZAn_2: name of the 2-bit encoded ZA tile. */
+ { 5, 3 }, /* ZAn_3: name of the 3-bit encoded ZA tile. */
+ { 6, 1 }, /* ZAn: name of the 1-bit encoded ZA tile. */
};
enum aarch64_operand_class
@@ -1938,6 +1948,49 @@ operand_general_constraint_met_p (const aarch64_opnd_info *opnds, int idx,
return 0;
break;
+ case AARCH64_OPND_SME_ZA_array_vrsb_1:
+ if (!check_za_access (opnd, mismatch_detail, idx, 12, 7, 2,
+ get_opcode_dependent_value (opcode)))
+ return 0;
+ break;
+
+ case AARCH64_OPND_SME_ZA_array_vrsh_1:
+ if (!check_za_access (opnd, mismatch_detail, idx, 12, 3, 2,
+ get_opcode_dependent_value (opcode)))
+ return 0;
+ break;
+
+ case AARCH64_OPND_SME_ZA_array_vrss_1:
+ if (!check_za_access (opnd, mismatch_detail, idx, 12, 1, 2,
+ get_opcode_dependent_value (opcode)))
+ return 0;
+ break;
+
+ case AARCH64_OPND_SME_ZA_array_vrsd_1:
+ if (!check_za_access (opnd, mismatch_detail, idx, 12, 0, 2,
+ get_opcode_dependent_value (opcode)))
+ return 0;
+ break;
+
+ case AARCH64_OPND_SME_ZA_array_vrsb_2:
+ if (!check_za_access (opnd, mismatch_detail, idx, 12, 3, 4,
+ get_opcode_dependent_value (opcode)))
+ return 0;
+ break;
+
+ case AARCH64_OPND_SME_ZA_array_vrsh_2:
+ if (!check_za_access (opnd, mismatch_detail, idx, 12, 1, 4,
+ get_opcode_dependent_value (opcode)))
+ return 0;
+ break;
+
+ case AARCH64_OPND_SME_ZA_array_vrss_2:
+ case AARCH64_OPND_SME_ZA_array_vrsd_2:
+ if (!check_za_access (opnd, mismatch_detail, idx, 12, 0, 4,
+ get_opcode_dependent_value (opcode)))
+ return 0;
+ break;
+
case AARCH64_OPND_SME_ZA_HV_idx_srcxN:
case AARCH64_OPND_SME_ZA_HV_idx_destxN:
size = aarch64_get_qualifier_esize (opnd->qualifier);
@@ -4103,6 +4156,30 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
? style_sub_mnem (styler, "vgx4") : "");
break;
+ case AARCH64_OPND_SME_ZA_array_vrsb_1:
+ case AARCH64_OPND_SME_ZA_array_vrsh_1:
+ case AARCH64_OPND_SME_ZA_array_vrss_1:
+ case AARCH64_OPND_SME_ZA_array_vrsd_1:
+ case AARCH64_OPND_SME_ZA_array_vrsb_2:
+ case AARCH64_OPND_SME_ZA_array_vrsh_2:
+ case AARCH64_OPND_SME_ZA_array_vrss_2:
+ case AARCH64_OPND_SME_ZA_array_vrsd_2:
+ snprintf (buf, size, "%s [%s, %s%s%s]",
+ style_reg (styler, "za%d%c%s%s",
+ opnd->indexed_za.regno,
+ opnd->indexed_za.v ? 'v': 'h',
+ opnd->qualifier == AARCH64_OPND_QLF_NIL ? "" : ".",
+ (opnd->qualifier == AARCH64_OPND_QLF_NIL
+ ? ""
+ : aarch64_get_qualifier_name (opnd->qualifier))),
+ style_reg (styler, "w%d", opnd->indexed_za.index.regno),
+ style_imm (styler, "%" PRIi64, opnd->indexed_za.index.imm),
+ opnd->indexed_za.index.countm1 ? ":" : "",
+ opnd->indexed_za.index.countm1 ? style_imm (styler, "%d",
+ opnd->indexed_za.index.imm
+ + opnd->indexed_za.index.countm1):"");
+ break;
+
case AARCH64_OPND_SME_SM_ZA:
snprintf (buf, size, "%s",
style_reg (styler, opnd->reg.regno == 's' ? "sm" : "za"));
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index a8ccdafd044efd62d11ba1e4c199792f6dd44559..9c7648b0a6df5444cc89f52aef3d455e624eedbb 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1497,6 +1497,10 @@
{ \
QLF2(S_B,S_B), \
}
+#define OP_SVE_HH \
+{ \
+ QLF2(S_H,S_H), \
+}
#define OP_SVE_BBU \
{ \
QLF3(S_B,S_B,NIL), \
@@ -2614,6 +2618,8 @@ static const aarch64_feature_set aarch64_feature_d128_the =
AARCH64_FEATURES (2, D128, THE);
static const aarch64_feature_set aarch64_feature_b16b16 =
AARCH64_FEATURE (B16B16);
+static const aarch64_feature_set aarch64_feature_sme2p1 =
+ AARCH64_FEATURE (SME2p1);
#define CORE &aarch64_feature_v8
#define FP &aarch64_feature_fp
@@ -2677,6 +2683,7 @@ static const aarch64_feature_set aarch64_feature_b16b16 =
#define THE &aarch64_feature_the
#define D128_THE &aarch64_feature_d128_the
#define B16B16 &aarch64_feature_b16b16
+#define SME2p1 &aarch64_feature_sme2p1
#define CORE_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS) \
{ NAME, OPCODE, MASK, CLASS, OP, CORE, OPS, QUALS, FLAGS, 0, 0, NULL }
@@ -2743,6 +2750,9 @@ static const aarch64_feature_set aarch64_feature_b16b16 =
#define SVE2_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
{ NAME, OPCODE, MASK, CLASS, OP, SVE2, OPS, QUALS, \
FLAGS | F_STRICT, 0, TIED, NULL }
+#define SME2p1_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
+ { NAME, OPCODE, MASK, CLASS, OP, SME2p1, OPS, QUALS, \
+ FLAGS | F_STRICT, 0, TIED, NULL }
#define SVE2_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
{ NAME, OPCODE, MASK, CLASS, OP, SVE2, OPS, QUALS, \
FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
@@ -6289,6 +6299,16 @@ const struct aarch64_opcode aarch64_opcode_table[] =
B16B16_INSN("bfmls", 0x64200c00, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
B16B16_INSN("bfmul", 0x64202800, 0xffa0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn, SVE_Zm3_11_INDEX), OP_SVE_VVV_H, 0, 0),
+/* SME2.1 movaz instructions. */
+ SME2p1_INSN ("movaz", 0xc0060600, 0xffff1f83, sme2_movaz, 0, OP2 (SME_Zdnx4, SME_ZA_array_vrsb_2), OP_SVE_BB, 0, 0),
+ SME2p1_INSN ("movaz", 0xc0460600, 0xffff1f83, sme2_movaz, 0, OP2 (SME_Zdnx4, SME_ZA_array_vrsh_2), OP_SVE_HH, 0, 0),
+ SME2p1_INSN ("movaz", 0xc0860600, 0xffff1f83, sme2_movaz, 0, OP2 (SME_Zdnx4, SME_ZA_array_vrss_2), OP_SVE_SS, 0, 0),
+ SME2p1_INSN ("movaz", 0xc0c60600, 0xffff1f03, sme2_movaz, 0, OP2 (SME_Zdnx4, SME_ZA_array_vrsd_2), OP_SVE_DD, 0, 0),
+
+ SME2p1_INSN ("movaz", 0xc0060200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsb_1), OP_SVE_BB, 0, 0),
+ SME2p1_INSN ("movaz", 0xc0460200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsh_1), OP_SVE_HH, 0, 0),
+ SME2p1_INSN ("movaz", 0xc0860200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrss_1), OP_SVE_SS, 0, 0),
+ SME2p1_INSN ("movaz", 0xc0c60200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsd_1), OP_SVE_DD, 0, 0),
{0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
};
@@ -6726,6 +6746,22 @@ const struct aarch64_opcode aarch64_opcode_table[] =
Y(SIMD_REG, regno, "SVE_Vd", 0, F(FLD_SVE_Vd), "a SIMD register") \
Y(SIMD_REG, regno, "SVE_Vm", 0, F(FLD_SVE_Vm), "a SIMD register") \
Y(SIMD_REG, regno, "SVE_Vn", 0, F(FLD_SVE_Vn), "a SIMD register") \
+ Y(ZA_ACCESS, sme_za_vrs1, "SME_ZA_array_vrsb_1", 0, \
+ F(FLD_SME_V,FLD_SME_Rv,FLD_off3), "ZA0 tile") \
+ Y(ZA_ACCESS, sme_za_vrs1, "SME_ZA_array_vrsh_1", 0, \
+ F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn_1,FLD_off2), "1 bit ZA tile") \
+ Y(ZA_ACCESS, sme_za_vrs1, "SME_ZA_array_vrss_1", 0, \
+ F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn_2,FLD_ol), "2 bit ZA tile") \
+ Y(ZA_ACCESS, sme_za_vrs1, "SME_ZA_array_vrsd_1", 0, \
+ F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn_3), "3 bit ZA tile") \
+ Y(ZA_ACCESS, sme_za_vrs2, "SME_ZA_array_vrsb_2", 0, \
+ F(FLD_SME_V,FLD_SME_Rv,FLD_off2), "ZA0 tile") \
+ Y(ZA_ACCESS, sme_za_vrs2, "SME_ZA_array_vrsh_2", 0, \
+ F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn,FLD_ol), "1 bit ZA tile") \
+ Y(ZA_ACCESS, sme_za_vrs2, "SME_ZA_array_vrss_2", 0, \
+ F(FLD_SME_V,FLD_SME_Rv,FLD_off2), "2 bit ZA tile") \
+ Y(ZA_ACCESS, sme_za_vrs2, "SME_ZA_array_vrsd_2", 0, \
+ F(FLD_SME_V,FLD_SME_Rv,FLD_ZAn_3), "3 bit ZA tile") \
Y(SVE_REG, regno, "SVE_Za_5", 0, F(FLD_SVE_Za_5), \
"an SVE vector register") \
Y(SVE_REG, regno, "SVE_Za_16", 0, F(FLD_SVE_Za_16), \
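Note on the slice offset handling in aarch64_ins_sme_za_vrs1/2 and aarch64_ext_sme_za_vrs1/2: the assembler side stores the first slice offset divided by the group size (countm1 + 1), and the disassembler side multiplies the field back by num_offset and prints a first:last range. The following is only a minimal standalone sketch of that round trip, not code from the patch. The helper names encode_offset, decode_offset and format_range are hypothetical, and the sketch leaves out the actual field insertion and extraction.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration only, not binutils API.
   GROUP is the number of consecutive slices accessed: 2 for the
   *_1 operands and 4 for the *_2 operands in the patch.  */

/* Assembler side: the offset field holds imm / (countm1 + 1).  */
static unsigned
encode_offset (unsigned imm, unsigned group)
{
  return imm / group;
}

/* Disassembler side: the first slice is field * num_offset.  */
static unsigned
decode_offset (unsigned field, unsigned group)
{
  return field * group;
}

/* Format the "first:last" range from imm and imm + countm1, which is
   what the new aarch64_print_operand case prints after the register.  */
static int
format_range (char *buf, size_t size, unsigned field, unsigned group)
{
  unsigned first = decode_offset (field, group);
  return snprintf (buf, size, "%u:%u", first, first + group - 1);
}
```

Under these assumptions, an offset of 6 with a two-slice group is stored as 3 and formatted as 6:7, and an offset of 4 with a four-slice group is stored as 1 and formatted as 4:7. An offset that is not a multiple of the group size would not survive the round trip, which is why the operand constraint check has to reject it before encoding.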
* [PATCH 3/6][Binutils] aarch64: Add support for FEAT_SVE2p1.
2024-01-15 9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
@ 2024-01-15 9:35 ` Srinath Parvathaneni
0 siblings, 0 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15 9:35 UTC (permalink / raw)
To: binutils; +Cc: richard.earnshaw, nickc
[-- Attachment #1: Type: text/plain, Size: 358 bytes --]
Hi,
This patch adds support for the FEAT_SVE2p1 (SVE2.1 Extension) feature,
along with the optional +sve2p1 flag to enable it.
Support is also added for the following SVE2p1 instructions:
addqv, andqv, smaxqv, sminqv, umaxqv and uminqv.
Regression tested on the aarch64-none-elf target; no regressions found.
Ok for binutils-master?
Regards,
Srinath.
[-- Attachment #2: 3_6.patch --]
[-- Type: text/x-patch, Size: 15176 bytes --]
diff --git a/gas/NEWS b/gas/NEWS
index d2c5c0641c4392d66472e535e5f51756fc5d511f..6b2282a393b8ae7046381935c5a9263879a7893f 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,8 @@
-*- text -*-
+* Add support for the Arm Scalable Vector Extension version 2.1 (SVE2.1)
+ instructions.
+
* Add support for the AArch64 Scalable Matrix Extension version 2.1 (SME2.1)
instructions.
diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 34159c2168b78fe12d4a549678ae77be88e50313..04dd08a6fa71b84b3e71e0ab422fb6deb9fedb38 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -10354,6 +10354,7 @@ static const struct aarch64_option_cpu_value_table aarch64_features[] = {
AARCH64_FEATURE (LSE128)},
{"b16b16", AARCH64_FEATURE (B16B16), AARCH64_FEATURE (SVE2)},
{"sme2p1", AARCH64_FEATURE (SME2p1), AARCH64_FEATURE (SME2)},
+ {"sve2p1", AARCH64_FEATURE (SVE2p1), AARCH64_FEATURE (SVE2)},
{NULL, AARCH64_NO_FEATURES, AARCH64_NO_FEATURES},
};
diff --git a/gas/doc/c-aarch64.texi b/gas/doc/c-aarch64.texi
index 1f3a4fbcafbab652d70f202b33527ae3dde7e1bb..7a8da72c24a452b1867656acfd1ee1af3e4b56d5 100644
--- a/gas/doc/c-aarch64.texi
+++ b/gas/doc/c-aarch64.texi
@@ -278,6 +278,8 @@ automatically cause those extensions to be disabled.
@tab Enable the 128-bit Page Descriptor Extension. This implies @code{lse128}.
@item @code{sme2p1} @tab N/A @tab No
@tab Enable the SME2.1 Extension.
+@item @code{sve2p1} @tab N/A @tab No
+ @tab Enable the SVE2.1 Extension.
@end multitable
@node AArch64 Syntax
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.d b/gas/testsuite/gas/aarch64/sve2p1-1-bad.d
new file mode 100644
index 0000000000000000000000000000000000000000..a2ca49ef487563a55ae8c26ca4318e68da850e64
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.d
@@ -0,0 +1,4 @@
+#name: Illegal test of SVE2.1 min max instructions.
+#as: -march=armv9.4-a
+#source: sve2p1-1.s
+#error_output: sve2p1-1-bad.l
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
new file mode 100644
index 0000000000000000000000000000000000000000..6b07eee9e94d93a9e8d6357a741d2d6ef90601e0
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
@@ -0,0 +1,37 @@
+.*: Assembler messages:
+.*: Error: selected processor does not support `addqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `addqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `addqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `addqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `addqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `addqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `andqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `andqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `andqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `andqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `andqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `andqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `smaxqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `smaxqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `smaxqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `smaxqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `smaxqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `smaxqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `umaxqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `umaxqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `umaxqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `umaxqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `umaxqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `umaxqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `sminqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `sminqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `sminqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `sminqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `sminqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `sminqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `uminqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `uminqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `uminqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `uminqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `uminqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `uminqv v16.4s,p7,z0.s'
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.d b/gas/testsuite/gas/aarch64/sve2p1-1.d
new file mode 100644
index 0000000000000000000000000000000000000000..d3d14f3c455aa352d31e01195196e03397ed4271
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.d
@@ -0,0 +1,46 @@
+#name: Test of SVE2.1 min max instructions.
+#as: -march=armv9.4-a+sve2p1
+#objdump: -dr
+
+[^:]+: file format .*
+
+
+[^:]+:
+
+[^:]+:
+.*: 04052200 addqv v0.16b, p0, z16.b
+.*: 04452501 addqv v1.8h, p1, z8.h
+.*: 04852882 addqv v2.4s, p2, z4.s
+.*: 04c52c44 addqv v4.2d, p3, z2.d
+.*: 04c53028 addqv v8.2d, p4, z1.d
+.*: 04853c10 addqv v16.4s, p7, z0.s
+.*: 041e2200 andqv v0.16b, p0, z16.b
+.*: 045e2501 andqv v1.8h, p1, z8.h
+.*: 049e2882 andqv v2.4s, p2, z4.s
+.*: 04de2c44 andqv v4.2d, p3, z2.d
+.*: 04de3028 andqv v8.2d, p4, z1.d
+.*: 049e3c10 andqv v16.4s, p7, z0.s
+.*: 040c2200 smaxqv v0.16b, p0, z16.b
+.*: 044c2501 smaxqv v1.8h, p1, z8.h
+.*: 048c2882 smaxqv v2.4s, p2, z4.s
+.*: 04cc2c44 smaxqv v4.2d, p3, z2.d
+.*: 04cc3028 smaxqv v8.2d, p4, z1.d
+.*: 048c3c10 smaxqv v16.4s, p7, z0.s
+.*: 040d2200 umaxqv v0.16b, p0, z16.b
+.*: 044d2501 umaxqv v1.8h, p1, z8.h
+.*: 048d2882 umaxqv v2.4s, p2, z4.s
+.*: 04cd2c44 umaxqv v4.2d, p3, z2.d
+.*: 04cd3028 umaxqv v8.2d, p4, z1.d
+.*: 048d3c10 umaxqv v16.4s, p7, z0.s
+.*: 040e2200 sminqv v0.16b, p0, z16.b
+.*: 044e2501 sminqv v1.8h, p1, z8.h
+.*: 048e2882 sminqv v2.4s, p2, z4.s
+.*: 04ce2c44 sminqv v4.2d, p3, z2.d
+.*: 04ce3028 sminqv v8.2d, p4, z1.d
+.*: 048e3c10 sminqv v16.4s, p7, z0.s
+.*: 040f2200 uminqv v0.16b, p0, z16.b
+.*: 044f2501 uminqv v1.8h, p1, z8.h
+.*: 048f2882 uminqv v2.4s, p2, z4.s
+.*: 04cf2c44 uminqv v4.2d, p3, z2.d
+.*: 04cf3028 uminqv v8.2d, p4, z1.d
+.*: 048f3c10 uminqv v16.4s, p7, z0.s
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.s b/gas/testsuite/gas/aarch64/sve2p1-1.s
new file mode 100644
index 0000000000000000000000000000000000000000..c56ebf856ab5815efa01e06f40d04360f8afc7bc
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.s
@@ -0,0 +1,41 @@
+addqv v0.16b, p0, z16.b
+addqv v1.8h, p1, z8.h
+addqv v2.4s, p2, z4.s
+addqv v4.2d, p3, z2.d
+addqv v8.2d, p4, z1.d
+addqv v16.4s, p7, z0.s
+
+andqv v0.16b, p0, z16.b
+andqv v1.8h, p1, z8.h
+andqv v2.4s, p2, z4.s
+andqv v4.2d, p3, z2.d
+andqv v8.2d, p4, z1.d
+andqv v16.4s, p7, z0.s
+
+smaxqv v0.16b, p0, z16.b
+smaxqv v1.8h, p1, z8.h
+smaxqv v2.4s, p2, z4.s
+smaxqv v4.2d, p3, z2.d
+smaxqv v8.2d, p4, z1.d
+smaxqv v16.4s, p7, z0.s
+
+umaxqv v0.16b, p0, z16.b
+umaxqv v1.8h, p1, z8.h
+umaxqv v2.4s, p2, z4.s
+umaxqv v4.2d, p3, z2.d
+umaxqv v8.2d, p4, z1.d
+umaxqv v16.4s, p7, z0.s
+
+sminqv v0.16b, p0, z16.b
+sminqv v1.8h, p1, z8.h
+sminqv v2.4s, p2, z4.s
+sminqv v4.2d, p3, z2.d
+sminqv v8.2d, p4, z1.d
+sminqv v16.4s, p7, z0.s
+
+uminqv v0.16b, p0, z16.b
+uminqv v1.8h, p1, z8.h
+uminqv v2.4s, p2, z4.s
+uminqv v4.2d, p3, z2.d
+uminqv v8.2d, p4, z1.d
+uminqv v16.4s, p7, z0.s
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index 648e25f3e4242bb738eee5f62079838784223b8a..1af49c406e06e79ba81a1f01887d43da37d8a625 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -226,6 +226,8 @@ enum aarch64_feature_bit {
AARCH64_FEATURE_B16B16,
/* SME2.1 instructions. */
AARCH64_FEATURE_SME2p1,
+ /* SVE2.1 instructions. */
+ AARCH64_FEATURE_SVE2p1,
AARCH64_NUM_FEATURES
};
@@ -1000,6 +1002,7 @@ enum aarch64_insn_class
cssc,
gcs,
the,
+ sve2_urqvs
};
/* Opcode enumerators. */
@@ -1272,7 +1275,9 @@ extern const aarch64_opcode aarch64_opcode_table[];
allow. This impacts the constraintts on assembly but yelds no
impact on disassembly. */
#define F_OPD_NARROW (1ULL << 33)
-/* Next bit is 34. */
+/* For instructions with a size field in bits [23:22]. */
+#define F_OPD_SIZE (1ULL << 34)
+/* Next bit is 35. */
/* Instruction constraints. */
/* This instruction has a predication constraint on the instruction at PC+4. */
@@ -1339,7 +1344,8 @@ static inline bool
opcode_has_special_coder (const aarch64_opcode *opcode)
{
return (opcode->flags & (F_SF | F_LSE_SZ | F_SIZEQ | F_FPTYPE | F_SSIZE | F_T
- | F_GPRSIZE_IN_Q | F_LDS_SIZE | F_MISC | F_N | F_COND)) != 0;
+ | F_GPRSIZE_IN_Q | F_LDS_SIZE | F_MISC | F_N | F_COND
+ | F_OPD_SIZE)) != 0;
}
\f
struct aarch64_name_value_pair
diff --git a/opcodes/aarch64-asm.c b/opcodes/aarch64-asm.c
index 3fac127a5899077e2ac19c5e98df737b8ffbe147..1dfd59df42dbbe5640ece7c83f43f027a8329d06 100644
--- a/opcodes/aarch64-asm.c
+++ b/opcodes/aarch64-asm.c
@@ -1981,6 +1981,20 @@ do_special_encoding (struct aarch64_inst *inst)
gen_sub_field (FLD_imm5, 0, num + 1, &field);
insert_field_2 (&field, &inst->value, 1 << num, inst->opcode->mask);
}
+
+ if ((inst->opcode->flags & F_OPD_SIZE) && inst->opcode->iclass == sve2_urqvs)
+ {
+ enum aarch64_opnd_qualifier qualifier[2];
+ aarch64_insn value1 = 0;
+ idx = 0;
+ qualifier[0] = inst->operands[idx].qualifier;
+ qualifier[1] = inst->operands[idx+2].qualifier;
+ value = aarch64_get_qualifier_standard_value (qualifier[0]);
+ value1 = aarch64_get_qualifier_standard_value (qualifier[1]);
+ assert ((value >> 1) == value1);
+ insert_field (FLD_size, &inst->value, value1, inst->opcode->mask);
+ }
+
if (inst->opcode->flags & F_GPRSIZE_IN_Q)
{
/* Use Rt to encode in the case of e.g.
diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
index a14b2ca02d2c0fcf6c94f5bb0c587a7168594a5b..d395438966f16d1fc0fa7117a434cff50901f96e 100644
--- a/opcodes/aarch64-dis.c
+++ b/opcodes/aarch64-dis.c
@@ -2609,6 +2609,16 @@ do_special_decoding (aarch64_inst *inst)
get_vreg_qualifier_from_value ((num << 1) | Q);
}
+ if ((inst->opcode->flags & F_OPD_SIZE) && inst->opcode->iclass == sve2_urqvs)
+ {
+ unsigned size;
+ size = (unsigned) extract_field (FLD_size, inst->value,
+ inst->opcode->mask);
+ inst->operands[0].qualifier
+ = get_vreg_qualifier_from_value (1 + (size << 1));
+ inst->operands[2].qualifier = get_sreg_qualifier_from_value (size);
+ }
+
if (inst->opcode->flags & F_GPRSIZE_IN_Q)
{
/* Use Rt to encode in the case of e.g.
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 9c7648b0a6df5444cc89f52aef3d455e624eedbb..f433257634e72b6afb64d58a1f0f052164291033 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1487,6 +1487,10 @@
- P: the operand has a /[ZM] suffix and the choice of suffix is not
the same for all variants.
+ - v: the operand has a V_[16B|8H|4S|2D] qualifier and the choice of
+ qualifier suffix is not the same for all variants. This is used for
+ the same kinds of operands as [BHSD] above.
+
The _<sizes>, if present, give the subset of [BHSD] that are accepted
by the V entries in <operands>. */
#define OP_SVE_B \
@@ -1911,6 +1915,13 @@
QLF3(S_S,S_H,NIL), \
QLF3(S_D,S_S,NIL), \
}
+#define OP_SVE_vUS_BHSD_BHSD \
+{ \
+ QLF3(V_16B,NIL,S_B), \
+ QLF3(V_8H,NIL,S_H), \
+ QLF3(V_4S,NIL,S_S), \
+ QLF3(V_2D,NIL,S_D), \
+}
#define OP_SVE_VMV_SD \
{ \
QLF3(S_S,P_M,S_S), \
@@ -2620,6 +2631,8 @@ static const aarch64_feature_set aarch64_feature_b16b16 =
AARCH64_FEATURE (B16B16);
static const aarch64_feature_set aarch64_feature_sme2p1 =
AARCH64_FEATURE (SME2p1);
+static const aarch64_feature_set aarch64_feature_sve2p1 =
+ AARCH64_FEATURE (SVE2p1);
#define CORE &aarch64_feature_v8
#define FP &aarch64_feature_fp
@@ -2684,6 +2697,7 @@ static const aarch64_feature_set aarch64_feature_sme2p1 =
#define D128_THE &aarch64_feature_d128_the
#define B16B16 &aarch64_feature_b16b16
#define SME2p1 &aarch64_feature_sme2p1
+#define SVE2p1 &aarch64_feature_sve2p1
#define CORE_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS) \
{ NAME, OPCODE, MASK, CLASS, OP, CORE, OPS, QUALS, FLAGS, 0, 0, NULL }
@@ -2762,6 +2776,12 @@ static const aarch64_feature_set aarch64_feature_sme2p1 =
#define B16B16_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
{ NAME, OPCODE, MASK, CLASS, OP, B16B16, OPS, QUALS, \
FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
+#define SVE2p1_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
+ { NAME, OPCODE, MASK, CLASS, OP, SVE2p1, OPS, QUALS, \
+ FLAGS | F_STRICT, 0, TIED, NULL }
+#define SVE2p1_INSNC(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,CONSTRAINTS,TIED) \
+ { NAME, OPCODE, MASK, CLASS, OP, SVE2p1, OPS, QUALS, \
+ FLAGS | F_STRICT, CONSTRAINTS, TIED, NULL }
#define SVE2AES_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS,TIED) \
{ NAME, OPCODE, MASK, CLASS, OP, SVE2_AES, OPS, QUALS, \
FLAGS | F_STRICT, 0, TIED, NULL }
@@ -6309,6 +6329,15 @@ const struct aarch64_opcode aarch64_opcode_table[] =
SME2p1_INSN ("movaz", 0xc0460200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsh_1), OP_SVE_HH, 0, 0),
SME2p1_INSN ("movaz", 0xc0860200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrss_1), OP_SVE_SS, 0, 0),
SME2p1_INSN ("movaz", 0xc0c60200, 0xffff1f01, sme2_movaz, 0, OP2 (SME_Zdnx2, SME_ZA_array_vrsd_1), OP_SVE_DD, 0, 0),
+
+/* SVE2p1 Instructions. */
+ SVE2p1_INSNC("addqv",0x04052000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("andqv",0x041e2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("smaxqv",0x040c2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("sminqv",0x040e2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("umaxqv",0x040d2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("uminqv",0x040f2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+
{0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
};
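Note on the new table entries: the opcode/mask pairs can be cross-checked against the expected encodings in sve2p1-1.d, and the size field that do_special_decoding reads selects the Vd arrangement. The following is a rough standalone sketch of both steps, not code from the patch. The names qv_entry, qv_lookup, qv_size and qv_vd_arrangement are hypothetical, the opcode and mask values are copied from the SVE2p1_INSNC lines above, and the size field is assumed to sit in bits [23:22] as the F_OPD_SIZE comment states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table for illustration; values copied from the
   SVE2p1_INSNC entries in the patch.  */
struct qv_entry
{
  const char *name;
  uint32_t opcode;
  uint32_t mask;
};

static const struct qv_entry qv_table[] = {
  { "addqv",  0x04052000, 0xff3fe000 },
  { "andqv",  0x041e2000, 0xff3fe000 },
  { "smaxqv", 0x040c2000, 0xff3fe000 },
  { "sminqv", 0x040e2000, 0xff3fe000 },
  { "umaxqv", 0x040d2000, 0xff3fe000 },
  { "uminqv", 0x040f2000, 0xff3fe000 },
};

/* Return the mnemonic whose fixed bits match INSN, or NULL if none does.  */
static const char *
qv_lookup (uint32_t insn)
{
  for (size_t i = 0; i < sizeof qv_table / sizeof qv_table[0]; i++)
    if ((insn & qv_table[i].mask) == qv_table[i].opcode)
      return qv_table[i].name;
  return NULL;
}

/* Extract the 2-bit size field (assumed to be bits [23:22]).  */
static unsigned
qv_size (uint32_t insn)
{
  return (insn >> 22) & 3;
}

/* Vd arrangement for a size value, in the same order as the rows of
   OP_SVE_vUS_BHSD_BHSD (16b, 8h, 4s, 2d).  */
static const char *
qv_vd_arrangement (unsigned size)
{
  static const char *const names[] = { "16b", "8h", "4s", "2d" };
  return names[size & 3];
}
```

For example, the word 04452501 from the test matches the addqv entry with size 1 (arrangement 8h), and 04cf3028 matches uminqv with size 3 (arrangement 2d), which agrees with the expected disassembly. The mask leaves bits [23:22] clear, so a single table entry covers all four element sizes.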
* [PATCH 4/6][Binutils] aarch64: Add SVE2.1 dupq, eorqv and extq instructions.
2024-01-15 9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
2024-01-15 9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
@ 2024-01-15 9:37 ` Srinath Parvathaneni
2024-01-15 9:38 ` PATCH 5/6][Binutils] aarch64: Add SVE2.1 fmin and fmax instructions Srinath Parvathaneni
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15 9:37 UTC (permalink / raw)
To: binutils; +Cc: richard.earnshaw, nickc
[-- Attachment #1: Type: text/plain, Size: 190 bytes --]
Hi,
This patch adds support for the SVE2.1 instructions dupq, eorqv and extq.
Regression tested on the aarch64-none-elf target; no regressions found.
Ok for binutils-master?
Regards,
Srinath.
[-- Attachment #2: 4_6.patch --]
[-- Type: text/x-patch, Size: 13581 bytes --]
diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 04dd08a6fa71b84b3e71e0ab422fb6deb9fedb38..0665732fe03cc59df4ebd36ee1afbad08c22b72e 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -6698,6 +6698,8 @@ parse_operands (char *str, const aarch64_opcode *opcode)
case AARCH64_OPND_SVE_Zm4_11_INDEX:
case AARCH64_OPND_SVE_Zm4_INDEX:
case AARCH64_OPND_SVE_Zn_INDEX:
+ case AARCH64_OPND_SVE_Zm_imm4:
+ case AARCH64_OPND_SVE_Zn_5_INDEX:
case AARCH64_OPND_SME_Zm_INDEX1:
case AARCH64_OPND_SME_Zm_INDEX2:
case AARCH64_OPND_SME_Zm_INDEX3_1:
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
index 6b07eee9e94d93a9e8d6357a741d2d6ef90601e0..f5a80d26768882f2b2e16840ad587612d34ae15e 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
@@ -35,3 +35,23 @@
.*: Error: selected processor does not support `uminqv v4.2d,p3,z2.d'
.*: Error: selected processor does not support `uminqv v8.2d,p4,z1.d'
.*: Error: selected processor does not support `uminqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `dupq z10.b,z20.b\[0\]'
+.*: Error: selected processor does not support `dupq z10.b,z20.b\[15\]'
+.*: Error: selected processor does not support `dupq z10.h,z20.h\[0\]'
+.*: Error: selected processor does not support `dupq z10.h,z20.h\[7\]'
+.*: Error: selected processor does not support `dupq z10.s,z20.s\[0\]'
+.*: Error: selected processor does not support `dupq z10.s,z20.s\[3\]'
+.*: Error: selected processor does not support `dupq z10.d,z20.d\[0\]'
+.*: Error: selected processor does not support `dupq z10.d,z20.d\[1\]'
+.*: Error: selected processor does not support `eorqv v0.16b,p0,z16.b'
+.*: Error: selected processor does not support `eorqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `eorqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `eorqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `eorqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `eorqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `extq z0.b,z0.b,z10.b\[15\]'
+.*: Error: selected processor does not support `extq z1.b,z1.b,z15.b\[7\]'
+.*: Error: selected processor does not support `extq z2.b,z2.b,z5.b\[3\]'
+.*: Error: selected processor does not support `extq z4.b,z4.b,z12.b\[1\]'
+.*: Error: selected processor does not support `extq z8.b,z8.b,z7.b\[4\]'
+.*: Error: selected processor does not support `extq z16.b,z16.b,z1.b\[8\]'
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.d b/gas/testsuite/gas/aarch64/sve2p1-1.d
index d3d14f3c455aa352d31e01195196e03397ed4271..6d718aec7cad5511bda282865d0667a6cdaa188d 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.d
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.d
@@ -44,3 +44,23 @@
.*: 04cf2c44 uminqv v4.2d, p3, z2.d
.*: 04cf3028 uminqv v8.2d, p4, z1.d
.*: 048f3c10 uminqv v16.4s, p7, z0.s
+.*: 0530268a dupq z10.b, z20.b\[0\]
+.*: 053f268a dupq z10.b, z20.b\[15\]
+.*: 0521268a dupq z10.h, z20.h\[0\]
+.*: 052f268a dupq z10.h, z20.h\[7\]
+.*: 0522268a dupq z10.s, z20.s\[0\]
+.*: 052e268a dupq z10.s, z20.s\[3\]
+.*: 0524268a dupq z10.d, z20.d\[0\]
+.*: 052c268a dupq z10.d, z20.d\[1\]
+.*: 041d2200 eorqv v0.16b, p0, z16.b
+.*: 045d2501 eorqv v1.8h, p1, z8.h
+.*: 049d2882 eorqv v2.4s, p2, z4.s
+.*: 04dd2c44 eorqv v4.2d, p3, z2.d
+.*: 04dd3028 eorqv v8.2d, p4, z1.d
+.*: 049d3c10 eorqv v16.4s, p7, z0.s
+.*: 056a27c0 extq z0.b, z0.b, z10.b\[15\]
+.*: 056f25c1 extq z1.b, z1.b, z15.b\[7\]
+.*: 056524c2 extq z2.b, z2.b, z5.b\[3\]
+.*: 056c2444 extq z4.b, z4.b, z12.b\[1\]
+.*: 05672508 extq z8.b, z8.b, z7.b\[4\]
+.*: 05612610 extq z16.b, z16.b, z1.b\[8\]
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.s b/gas/testsuite/gas/aarch64/sve2p1-1.s
index c56ebf856ab5815efa01e06f40d04360f8afc7bc..5278dcf5e67b4cb34ab45b2b2725ab3af14c2594 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.s
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.s
@@ -39,3 +39,25 @@ uminqv v2.4s, p2, z4.s
uminqv v4.2d, p3, z2.d
uminqv v8.2d, p4, z1.d
uminqv v16.4s, p7, z0.s
+dupq z10.b, z20.b[0]
+dupq z10.b, z20.b[15]
+dupq z10.h, z20.h[0]
+dupq z10.h, z20.h[7]
+dupq z10.s, z20.s[0]
+dupq z10.s, z20.s[3]
+dupq z10.d, z20.d[0]
+dupq z10.d, z20.d[1]
+
+eorqv v0.16b, p0, z16.b
+eorqv v1.8h, p1, z8.h
+eorqv v2.4s, p2, z4.s
+eorqv v4.2d, p3, z2.d
+eorqv v8.2d, p4, z1.d
+eorqv v16.4s, p7, z0.s
+
+extq z0.b, z0.b, z10.b[15]
+extq z1.b, z1.b, z15.b[7]
+extq z2.b, z2.b, z5.b[3]
+extq z4.b, z4.b, z12.b[1]
+extq z8.b, z8.b, z7.b[4]
+extq z16.b, z16.b, z1.b[8]
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index 1af49c406e06e79ba81a1f01887d43da37d8a625..de161db75d509b0ac96c604da7bc9743193d23b2 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -727,8 +727,10 @@ enum aarch64_opnd
AARCH64_OPND_SVE_Zm3_19_INDEX, /* z0-z7[0-3] in Zm3_INDEX plus bit 19. */
AARCH64_OPND_SVE_Zm3_22_INDEX, /* z0-z7[0-7] in Zm3_INDEX plus bit 22. */
AARCH64_OPND_SVE_Zm4_11_INDEX, /* z0-z15[0-3] in Zm plus bit 11. */
+ AARCH64_OPND_SVE_Zm_imm4, /* SVE vector register with 4-bit index. */
AARCH64_OPND_SVE_Zm4_INDEX, /* z0-z15[0-1] in Zm, bits [20,16]. */
AARCH64_OPND_SVE_Zn, /* SVE vector register in Zn. */
+ AARCH64_OPND_SVE_Zn_5_INDEX, /* Indexed SVE vector register, for DUPQ. */
AARCH64_OPND_SVE_Zn_INDEX, /* Indexed SVE vector register, for DUP. */
AARCH64_OPND_SVE_ZnxN, /* SVE vector register list in Zn. */
AARCH64_OPND_SVE_Zt, /* SVE vector register in Zt. */
@@ -1002,7 +1004,8 @@ enum aarch64_insn_class
cssc,
gcs,
the,
- sve2_urqvs
+ sve2_urqvs,
+ sve_index1,
};
/* Opcode enumerators. */
diff --git a/opcodes/aarch64-asm.h b/opcodes/aarch64-asm.h
index d4b6407dc5de8d6e103ee8ca5b5f2c6bb814647f..e48bf0db8a86149155e325923f6644bb90410ccb 100644
--- a/opcodes/aarch64-asm.h
+++ b/opcodes/aarch64-asm.h
@@ -93,6 +93,7 @@ AARCH64_DECL_OPD_INSERTER (ins_sve_float_half_one);
AARCH64_DECL_OPD_INSERTER (ins_sve_float_half_two);
AARCH64_DECL_OPD_INSERTER (ins_sve_float_zero_one);
AARCH64_DECL_OPD_INSERTER (ins_sve_index);
+AARCH64_DECL_OPD_INSERTER (ins_sve_index_imm);
AARCH64_DECL_OPD_INSERTER (ins_sve_limm_mov);
AARCH64_DECL_OPD_INSERTER (ins_sve_quad_index);
AARCH64_DECL_OPD_INSERTER (ins_sve_reglist);
diff --git a/opcodes/aarch64-asm.c b/opcodes/aarch64-asm.c
index 1dfd59df42dbbe5640ece7c83f43f027a8329d06..0de09f0435abc3b7761707b2c81c58f5f3b1a10e 100644
--- a/opcodes/aarch64-asm.c
+++ b/opcodes/aarch64-asm.c
@@ -1220,6 +1220,21 @@ aarch64_ins_sve_index (const aarch64_operand *self,
return true;
}
+/* Encode Zn.<T>[<imm>], where <imm> is an immediate in the range 0 to one less
+ than the number of elements in 128 bits, encoded in i1:tsz. */
+bool
+aarch64_ins_sve_index_imm (const aarch64_operand *self,
+ const aarch64_opnd_info *info, aarch64_insn *code,
+ const aarch64_inst *inst ATTRIBUTE_UNUSED,
+ aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+ insert_field (self->fields[0], code, info->reglane.regno, 0);
+ unsigned int esize = aarch64_get_qualifier_esize (info->qualifier);
+ insert_fields (code, (info->reglane.index * 2 + 1) * esize, 0,
+ 2, self->fields[1], self->fields[2]);
+ return true;
+}
+
/* Encode a logical/bitmask immediate for the MOV alias of SVE DUPM. */
bool
aarch64_ins_sve_limm_mov (const aarch64_operand *self,
@@ -2079,6 +2094,7 @@ aarch64_encode_variant_using_iclass (struct aarch64_inst *inst)
case sme_shift:
case sve_index:
+ case sve_index1:
case sve_shift_pred:
case sve_shift_unpred:
case sve_shift_tsz_hsd:
diff --git a/opcodes/aarch64-dis.h b/opcodes/aarch64-dis.h
index 9a38c1ab50f7fdb27588c7451ade19c166e69c96..30212f2ae2c2759b5667e5a007912d22c4a702fc 100644
--- a/opcodes/aarch64-dis.h
+++ b/opcodes/aarch64-dis.h
@@ -117,6 +117,7 @@ AARCH64_DECL_OPD_EXTRACTOR (ext_sve_float_half_one);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_float_half_two);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_float_zero_one);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_index);
+AARCH64_DECL_OPD_EXTRACTOR (ext_sve_index_imm);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_limm_mov);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_quad_index);
AARCH64_DECL_OPD_EXTRACTOR (ext_sve_reglist);
diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
index d395438966f16d1fc0fa7117a434cff50901f96e..bffa760004a3ede5c14287ee4db09d6db371bc87 100644
--- a/opcodes/aarch64-dis.c
+++ b/opcodes/aarch64-dis.c
@@ -2097,6 +2097,26 @@ aarch64_ext_sve_index (const aarch64_operand *self,
return true;
}
+/* Decode Zn.<T>[<imm>], where <imm> is an immediate in the range 0 to one less
+ than the number of elements in 128 bits, encoded in i1:tsz. */
+bool
+aarch64_ext_sve_index_imm (const aarch64_operand *self,
+ aarch64_opnd_info *info, aarch64_insn code,
+ const aarch64_inst *inst ATTRIBUTE_UNUSED,
+ aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+ int val;
+
+ info->reglane.regno = extract_field (self->fields[0], code, 0);
+ val = extract_fields (code, 0, 2, self->fields[2], self->fields[1]);
+ if ((val & 15) == 0)
+ return false;
+ while ((val & 1) == 0)
+ val /= 2;
+ info->reglane.index = val / 2;
+ return true;
+}
+
/* Decode a logical immediate for the MOV alias of SVE DUPM. */
bool
aarch64_ext_sve_limm_mov (const aarch64_operand *self,
@@ -3231,6 +3251,17 @@ aarch64_decode_variant_using_iclass (aarch64_inst *inst)
}
break;
+ case sve_index1:
+ i = extract_fields (inst->value, 0, 2, FLD_SVE_tsz, FLD_SVE_i2h);
+ if ((i & 15) == 0)
+ return false;
+ while ((i & 1) == 0)
+ {
+ i >>= 1;
+ variant += 1;
+ }
+ break;
+
case sve_limm:
/* Pick the smallest applicable element size. */
if ((inst->value & 0x20600) == 0x600)
diff --git a/opcodes/aarch64-opc.c b/opcodes/aarch64-opc.c
index cf76871930f9f4e8613a977efb81464dce3d8ba7..1d8ed26c7090e4b73489b15e74a911e33b54555c 100644
--- a/opcodes/aarch64-opc.c
+++ b/opcodes/aarch64-opc.c
@@ -1794,6 +1794,18 @@ operand_general_constraint_met_p (const aarch64_opnd_info *opnds, int idx,
return 0;
break;
+ case AARCH64_OPND_SVE_Zm_imm4:
+ if (!check_reglane (opnd, mismatch_detail, idx, "z", 0, 31, 0, 15))
+ return 0;
+ break;
+
+ case AARCH64_OPND_SVE_Zn_5_INDEX:
+ size = aarch64_get_qualifier_esize (opnd->qualifier);
+ if (!check_reglane (opnd, mismatch_detail, idx, "z", 0, 31,
+ 0, 16 / size - 1))
+ return 0;
+ break;
+
case AARCH64_OPND_SME_PNn3_INDEX1:
case AARCH64_OPND_SME_PNn3_INDEX2:
size = get_operand_field_width (get_operand_from_code (type), 1);
@@ -4074,6 +4086,7 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
case AARCH64_OPND_SME_Zm_INDEX3_1:
case AARCH64_OPND_SME_Zm_INDEX3_2:
case AARCH64_OPND_SME_Zm_INDEX3_10:
+ case AARCH64_OPND_SVE_Zn_5_INDEX:
case AARCH64_OPND_SME_Zm_INDEX4_1:
case AARCH64_OPND_SME_Zm_INDEX4_10:
case AARCH64_OPND_SME_Zn_INDEX1_16:
@@ -4082,6 +4095,7 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
case AARCH64_OPND_SME_Zn_INDEX3_14:
case AARCH64_OPND_SME_Zn_INDEX3_15:
case AARCH64_OPND_SME_Zn_INDEX4_14:
+ case AARCH64_OPND_SVE_Zm_imm4:
snprintf (buf, size, "%s[%s]",
(opnd->qualifier == AARCH64_OPND_QLF_NIL
? style_reg (styler, "z%d", opnd->reglane.regno)
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index f433257634e72b6afb64d58a1f0f052164291033..07f4eb319e9be1a8150224b59aba1ab831e51b29 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -6337,6 +6337,10 @@ const struct aarch64_opcode aarch64_opcode_table[] =
SVE2p1_INSNC("sminqv",0x040e2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
SVE2p1_INSNC("umaxqv",0x040d2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
SVE2p1_INSNC("uminqv",0x040f2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("eorqv",0x041d2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+
+ SVE2p1_INSN("dupq",0x05202400, 0xffe0fc00, sve_index1, 0, OP2 (SVE_Zd, SVE_Zn_5_INDEX), OP_SVE_VV_BHSD, 0, 0),
+ SVE2p1_INSN("extq",0x05602400, 0xfff0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zd, SVE_Zm_imm4), OP_SVE_BBB, 0, 0),
{0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
};
@@ -6816,11 +6820,17 @@ const struct aarch64_opcode aarch64_opcode_table[] =
Y(SVE_REG, sve_quad_index, "SVE_Zm4_11_INDEX", \
4 << OPD_F_OD_LSB, F(FLD_SVE_i2h, FLD_SVE_i3l, FLD_SVE_imm4), \
"an indexed SVE vector register") \
+ Y(SVE_REG, sve_quad_index, "SVE_Zm_imm4", \
+ 5 << OPD_F_OD_LSB, F(FLD_SVE_Zm_5, FLD_SVE_imm4), \
+ "a 4-bit indexed SVE vector register") \
Y(SVE_REG, sve_quad_index, "SVE_Zm4_INDEX", \
4 << OPD_F_OD_LSB, F(FLD_SVE_Zm_16), \
"an indexed SVE vector register") \
Y(SVE_REG, regno, "SVE_Zn", 0, F(FLD_SVE_Zn), \
"an SVE vector register") \
+ Y(SVE_REG, sve_index_imm, "SVE_Zn_5_INDEX", 0, \
+ F(FLD_SVE_Zn, FLD_SVE_i2h, FLD_SVE_tsz), \
+ "a 5-bit indexed SVE vector register") \
Y(SVE_REG, sve_index, "SVE_Zn_INDEX", 0, F(FLD_SVE_Zn), \
"an indexed SVE vector register") \
Y(SVE_REGLIST, sve_reglist, "SVE_ZnxN", 0, F(FLD_SVE_Zn), \
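
The index arithmetic used by aarch64_ins_sve_index_imm and aarch64_ext_sve_index_imm in the patch above can be tried out in isolation. The following is only a minimal sketch: pack_lane and unpack_lane are hypothetical names that do not exist in binutils, and the sketch treats the five bits as a single integer without modelling how insert_fields and extract_fields place them in the instruction word.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not part of binutils): combine a lane index and an
   element size in bytes (1, 2, 4 or 8) into one 5-bit value, using the
   same expression as the patch: (index * 2 + 1) * esize.  */
static unsigned int
pack_lane (unsigned int index, unsigned int esize)
{
  assert (esize == 1 || esize == 2 || esize == 4 || esize == 8);
  assert (index <= 16 / esize - 1);	/* Same bound as the check_reglane call.  */
  return (index * 2 + 1) * esize;
}

/* Hypothetical helper (not part of binutils): reverse pack_lane.  Returns
   false when the low four bits are all clear, which no element size
   produces.  The number of trailing zero bits gives log2 of the element
   size; the remaining odd value holds the index above its lowest bit.  */
static bool
unpack_lane (unsigned int val, unsigned int *index, unsigned int *esize)
{
  if ((val & 15) == 0)
    return false;
  *esize = 1;
  while ((val & 1) == 0)
    {
      val >>= 1;
      *esize <<= 1;
    }
  *index = val >> 1;
  return true;
}
```

For example, pack_lane (1, 8) gives 24 (binary 11000), and unpack_lane strips the three trailing zeros to recover an element size of 8 and an index of 1.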
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 5/6][Binutils] aarch64: Add SVE2.1 fmin and fmax instructions.
2024-01-15 9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
2024-01-15 9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
2024-01-15 9:37 ` [PATCH 4/6][Binutils] aarch64: Add SVE2.1 dupq, eorqv and extq instructions Srinath Parvathaneni
@ 2024-01-15 9:38 ` Srinath Parvathaneni
2024-01-15 9:40 ` [PATCH 6/6][Binutils] aarch64: Add SVE2.1 Contiguous load/store instructions Srinath Parvathaneni
2024-01-15 11:46 ` [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Nick Clifton
4 siblings, 0 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15 9:38 UTC (permalink / raw)
To: binutils; +Cc: richard.earnshaw, nickc
[-- Attachment #1: Type: text/plain, Size: 215 bytes --]
Hi,
This patch adds support for the SVE2.1 instructions faddqv,
fmaxnmqv, fmaxqv, fminnmqv and fminqv.
Regression tested for the aarch64-none-elf target; no regressions were found.
Ok for binutils-master?
Regards,
Srinath.
[-- Attachment #2: 5_6.patch --]
[-- Type: text/x-patch, Size: 6680 bytes --]
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
index f5a80d26768882f2b2e16840ad587612d34ae15e..08aef46de61a6cbbe88ebac77da03ee97c9ebe7c 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
@@ -55,3 +55,28 @@
.*: Error: selected processor does not support `extq z4.b,z4.b,z12.b\[1\]'
.*: Error: selected processor does not support `extq z8.b,z8.b,z7.b\[4\]'
.*: Error: selected processor does not support `extq z16.b,z16.b,z1.b\[8\]'
+.*: Error: selected processor does not support `faddqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `faddqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `faddqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `faddqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `faddqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `fmaxnmqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `fmaxnmqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `fmaxnmqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `fmaxnmqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `fmaxnmqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `fmaxqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `fmaxqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `fmaxqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `fmaxqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `fmaxqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `fminnmqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `fminnmqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `fminnmqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `fminnmqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `fminnmqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `fminqv v1.8h,p1,z8.h'
+.*: Error: selected processor does not support `fminqv v2.4s,p2,z4.s'
+.*: Error: selected processor does not support `fminqv v4.2d,p3,z2.d'
+.*: Error: selected processor does not support `fminqv v8.2d,p4,z1.d'
+.*: Error: selected processor does not support `fminqv v16.4s,p7,z0.s'
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.d b/gas/testsuite/gas/aarch64/sve2p1-1.d
index 6d718aec7cad5511bda282865d0667a6cdaa188d..437ce9789834683963910141c1468ad46b273ded 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.d
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.d
@@ -64,3 +64,28 @@
.*: 056c2444 extq z4.b, z4.b, z12.b\[1\]
.*: 05672508 extq z8.b, z8.b, z7.b\[4\]
.*: 05612610 extq z16.b, z16.b, z1.b\[8\]
+.*: 6450a501 faddqv v1.8h, p1, z8.h
+.*: 6490a882 faddqv v2.4s, p2, z4.s
+.*: 64d0ac44 faddqv v4.2d, p3, z2.d
+.*: 64d0b028 faddqv v8.2d, p4, z1.d
+.*: 6490bc10 faddqv v16.4s, p7, z0.s
+.*: 6454a501 fmaxnmqv v1.8h, p1, z8.h
+.*: 6494a882 fmaxnmqv v2.4s, p2, z4.s
+.*: 64d4ac44 fmaxnmqv v4.2d, p3, z2.d
+.*: 64d4b028 fmaxnmqv v8.2d, p4, z1.d
+.*: 6494bc10 fmaxnmqv v16.4s, p7, z0.s
+.*: 6456a501 fmaxqv v1.8h, p1, z8.h
+.*: 6496a882 fmaxqv v2.4s, p2, z4.s
+.*: 64d6ac44 fmaxqv v4.2d, p3, z2.d
+.*: 64d6b028 fmaxqv v8.2d, p4, z1.d
+.*: 6496bc10 fmaxqv v16.4s, p7, z0.s
+.*: 6455a501 fminnmqv v1.8h, p1, z8.h
+.*: 6495a882 fminnmqv v2.4s, p2, z4.s
+.*: 64d5ac44 fminnmqv v4.2d, p3, z2.d
+.*: 64d5b028 fminnmqv v8.2d, p4, z1.d
+.*: 6495bc10 fminnmqv v16.4s, p7, z0.s
+.*: 6457a501 fminqv v1.8h, p1, z8.h
+.*: 6497a882 fminqv v2.4s, p2, z4.s
+.*: 64d7ac44 fminqv v4.2d, p3, z2.d
+.*: 64d7b028 fminqv v8.2d, p4, z1.d
+.*: 6497bc10 fminqv v16.4s, p7, z0.s
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.s b/gas/testsuite/gas/aarch64/sve2p1-1.s
index 5278dcf5e67b4cb34ab45b2b2725ab3af14c2594..b4908b2be38d927bb61a38e5aba681837d8417e1 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.s
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.s
@@ -61,3 +61,32 @@ extq z2.b, z2.b, z5.b[3]
extq z4.b, z4.b, z12.b[1]
extq z8.b, z8.b, z7.b[4]
extq z16.b, z16.b, z1.b[8]
+faddqv v1.8h, p1, z8.h
+faddqv v2.4s, p2, z4.s
+faddqv v4.2d, p3, z2.d
+faddqv v8.2d, p4, z1.d
+faddqv v16.4s, p7, z0.s
+
+fmaxnmqv v1.8h, p1, z8.h
+fmaxnmqv v2.4s, p2, z4.s
+fmaxnmqv v4.2d, p3, z2.d
+fmaxnmqv v8.2d, p4, z1.d
+fmaxnmqv v16.4s, p7, z0.s
+
+fmaxqv v1.8h, p1, z8.h
+fmaxqv v2.4s, p2, z4.s
+fmaxqv v4.2d, p3, z2.d
+fmaxqv v8.2d, p4, z1.d
+fmaxqv v16.4s, p7, z0.s
+
+fminnmqv v1.8h, p1, z8.h
+fminnmqv v2.4s, p2, z4.s
+fminnmqv v4.2d, p3, z2.d
+fminnmqv v8.2d, p4, z1.d
+fminnmqv v16.4s, p7, z0.s
+
+fminqv v1.8h, p1, z8.h
+fminqv v2.4s, p2, z4.s
+fminqv v4.2d, p3, z2.d
+fminqv v8.2d, p4, z1.d
+fminqv v16.4s, p7, z0.s
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 07f4eb319e9be1a8150224b59aba1ab831e51b29..f01ca2abf59a9e6c99f3e742e9db8f46bb1c2a5e 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1922,6 +1922,12 @@
QLF3(V_4S,NIL,S_S), \
QLF3(V_2D,NIL,S_D), \
}
+#define OP_SVE_vUS_HSD_HSD \
+{ \
+ QLF3(V_8H,NIL,S_H), \
+ QLF3(V_4S,NIL,S_S), \
+ QLF3(V_2D,NIL,S_D), \
+}
#define OP_SVE_VMV_SD \
{ \
QLF3(S_S,P_M,S_S), \
@@ -6339,6 +6345,12 @@ const struct aarch64_opcode aarch64_opcode_table[] =
SVE2p1_INSNC("uminqv",0x040f2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
SVE2p1_INSNC("eorqv",0x041d2000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_BHSD_BHSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("faddqv",0x6410a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("fmaxnmqv",0x6414a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("fmaxqv",0x6416a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("fminnmqv",0x6415a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("fminqv",0x6417a000, 0xff3fe000, sve2_urqvs, 0, OP3 (Vd, SVE_Pg3, SVE_Zn), OP_SVE_vUS_HSD_HSD, F_OPD_SIZE, C_SCAN_MOVPRFX, 0),
+
SVE2p1_INSN("dupq",0x05202400, 0xffe0fc00, sve_index1, 0, OP2 (SVE_Zd, SVE_Zn_5_INDEX), OP_SVE_VV_BHSD, 0, 0),
SVE2p1_INSN("extq",0x05602400, 0xfff0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zd, SVE_Zm_imm4), OP_SVE_BBB, 0, 0),
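
Each new table entry above pairs an opcode value with a mask, and the expected encodings in sve2p1-1.d can be checked against that pair by hand. The sketch below is illustrative only: insn_matches and size_field are hypothetical names, not binutils functions, and reading bits 23:22 as the element-size selector is an inference from the test vectors in this patch (the mask 0xff3fe000 leaves exactly those two bits free).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: an instruction word matches a table entry when the
   bits selected by MASK equal OPCODE.  */
static bool
insn_matches (uint32_t insn, uint32_t opcode, uint32_t mask)
{
  return (insn & mask) == opcode;
}

/* Hypothetical helper: extract bits 23:22, the two bits that the mask
   0xff3fe000 leaves unconstrained.  In the expected test output the
   .h, .s and .d forms carry the values 1, 2 and 3 here.  */
static unsigned int
size_field (uint32_t insn)
{
  return (insn >> 22) & 3;
}
```

With the faddqv entry (opcode 0x6410a000, mask 0xff3fe000), the test encoding 0x6450a501 matches and has a size field of 1, while the fmaxnmqv encoding 0x6454a501 does not match.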
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 6/6][Binutils] aarch64: Add SVE2.1 Contiguous load/store instructions.
2024-01-15 9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
` (2 preceding siblings ...)
2024-01-15 9:38 ` [PATCH 5/6][Binutils] aarch64: Add SVE2.1 fmin and fmax instructions Srinath Parvathaneni
@ 2024-01-15 9:40 ` Srinath Parvathaneni
2024-01-15 11:46 ` [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Nick Clifton
4 siblings, 0 replies; 7+ messages in thread
From: Srinath Parvathaneni @ 2024-01-15 9:40 UTC (permalink / raw)
To: binutils; +Cc: richard.earnshaw, nickc
[-- Attachment #1: Type: text/plain, Size: 223 bytes --]
Hi,
This patch adds support for the SVE2.1 instructions ld1q,
ld2q, ld3q, ld4q, st1q, st2q, st3q and st4q.
Regression tested for the aarch64-none-elf target; no regressions were found.
Ok for binutils-master?
Regards,
Srinath.
[-- Attachment #2: 6_6.patch --]
[-- Type: text/x-patch, Size: 12646 bytes --]
diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index 0665732fe03cc59df4ebd36ee1afbad08c22b72e..5eff6a754adea9c44432e3faacf31d20c4f6fb98 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -6749,6 +6749,9 @@ parse_operands (char *str, const aarch64_opcode *opcode)
case AARCH64_OPND_SVE_ZtxN:
case AARCH64_OPND_SME_Zdnx2:
case AARCH64_OPND_SME_Zdnx4:
+ case AARCH64_OPND_SME_Zt2:
+ case AARCH64_OPND_SME_Zt3:
+ case AARCH64_OPND_SME_Zt4:
case AARCH64_OPND_SME_Zmx2:
case AARCH64_OPND_SME_Zmx4:
case AARCH64_OPND_SME_Znx2:
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
index 08aef46de61a6cbbe88ebac77da03ee97c9ebe7c..50a4bacc73c20324ae50b8688dd8cf5123a238ae 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
+++ b/gas/testsuite/gas/aarch64/sve2p1-1-bad.l
@@ -80,3 +80,17 @@
.*: Error: selected processor does not support `fminqv v4.2d,p3,z2.d'
.*: Error: selected processor does not support `fminqv v8.2d,p4,z1.d'
.*: Error: selected processor does not support `fminqv v16.4s,p7,z0.s'
+.*: Error: selected processor does not support `ld1q Z0.Q,p4/Z,\[Z16.D,x0\]'
+.*: Error: selected processor does not support `ld2q {Z0.Q,Z1.Q},p4/Z,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `ld3q {Z0.Q,Z1.Q,Z2.Q},p4/Z,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `ld4q {Z0.Q,Z1.Q,Z2.Q,Z3.Q},p4/Z,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `ld2q {Z0.Q,Z1.Q},p4/Z,\[x0,x2,lsl#4\]'
+.*: Error: selected processor does not support `ld3q {Z0.Q,Z1.Q,Z2.Q},p4/Z,\[x0,x4,lsl#4\]'
+.*: Error: selected processor does not support `ld4q {Z0.Q,Z1.Q,Z2.Q,Z3.Q},p4/Z,\[x0,x6,lsl#4\]'
+.*: Error: selected processor does not support `st1q Z0.Q,p4,\[Z16.D,x0\]'
+.*: Error: selected processor does not support `st2q {Z0.Q,Z1.Q},p4,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `st3q {Z0.Q,Z1.Q,Z2.Q},p4,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `st4q {Z0.Q,Z1.Q,Z2.Q,Z3.Q},p4,\[x0,#-4,MUL VL\]'
+.*: Error: selected processor does not support `st2q {Z0.Q,Z1.Q},p4,\[x0,x2,lsl#4\]'
+.*: Error: selected processor does not support `st3q {Z0.Q,Z1.Q,Z2.Q},p4,\[x0,x4,lsl#4\]'
+.*: Error: selected processor does not support `st4q {Z0.Q,Z1.Q,Z2.Q,Z3.Q},p4,\[x0,x6,lsl#4\]'
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.d b/gas/testsuite/gas/aarch64/sve2p1-1.d
index 437ce9789834683963910141c1468ad46b273ded..daece899b38bba4daa2ca9e58dba2d551f6cf988 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.d
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.d
@@ -89,3 +89,17 @@
.*: 64d7ac44 fminqv v4.2d, p3, z2.d
.*: 64d7b028 fminqv v8.2d, p4, z1.d
.*: 6497bc10 fminqv v16.4s, p7, z0.s
+.*: c400b200 ld1q z0.q, p4/z, \[z16.d, x0\]
+.*: a49ef000 ld2q {z0.q, z1.q}, p4/z, \[x0, #-4, mul vl\]
+.*: a51ef000 ld3q {z0.q, z1.q, z2.q}, p4/z, \[x0, #-4, mul vl\]
+.*: a59ef000 ld4q {z0.q, z1.q, z2.q, z3.q}, p4/z, \[x0, #-4, mul vl\]
+.*: a4a2f000 ld2h {z0.h-z1.h}, p4/z, \[x0, #4, mul vl\]
+.*: a5249000 ld3q {z0.q, z1.q, z2.q}, p4/z, \[x0, x4, lsl #4\]
+.*: a5a69000 ld4q {z0.q, z1.q, z2.q, z3.q}, p4/z, \[x0, x6, lsl #4\]
+.*: e4203200 st1q z0.q, p4, \[z16.d, x0\]
+.*: e44e1000 st2q {z0.q, z1.q}, p4, \[x0, #-4, mul vl\]
+.*: e48e1000 st3q {z0.q, z1.q, z2.q}, p4, \[x0, #-4, mul vl\]
+.*: e4ce1000 st4q {z0.q, z1.q, z2.q, z3.q}, p4, \[x0, #-4, mul vl\]
+.*: e4621000 st2q {z0.q, z1.q}, p4, \[x0, x2, lsl #4\]
+.*: e4a41000 st3q {z0.q, z1.q, z2.q}, p4, \[x0, x4, lsl #4\]
+.*: e4e61000 st4q {z0.q, z1.q, z2.q, z3.q}, p4, \[x0, x6, lsl #4\]
diff --git a/gas/testsuite/gas/aarch64/sve2p1-1.s b/gas/testsuite/gas/aarch64/sve2p1-1.s
index b4908b2be38d927bb61a38e5aba681837d8417e1..2a1c7c107d757ae922cec5566adbace1f03e0dce 100644
--- a/gas/testsuite/gas/aarch64/sve2p1-1.s
+++ b/gas/testsuite/gas/aarch64/sve2p1-1.s
@@ -90,3 +90,18 @@ fminqv v2.4s, p2, z4.s
fminqv v4.2d, p3, z2.d
fminqv v8.2d, p4, z1.d
fminqv v16.4s, p7, z0.s
+ld1q Z0.Q, p4/Z, [Z16.D, x0]
+ld2q {Z0.Q, Z1.Q}, p4/Z, [x0, #-4, MUL VL]
+ld3q {Z0.Q, Z1.Q, Z2.Q}, p4/Z, [x0, #-4, MUL VL]
+ld4q {Z0.Q, Z1.Q, Z2.Q, Z3.Q}, p4/Z, [x0, #-4, MUL VL]
+ld2q {Z0.Q, Z1.Q}, p4/Z, [x0, x2, lsl #4]
+ld3q {Z0.Q, Z1.Q, Z2.Q}, p4/Z, [x0, x4, lsl #4]
+ld4q {Z0.Q, Z1.Q, Z2.Q, Z3.Q}, p4/Z, [x0, x6, lsl #4]
+
+st1q Z0.Q, p4, [Z16.D, x0]
+st2q {Z0.Q, Z1.Q}, p4, [x0, #-4, MUL VL]
+st3q {Z0.Q, Z1.Q, Z2.Q}, p4, [x0, #-4, MUL VL]
+st4q {Z0.Q, Z1.Q, Z2.Q, Z3.Q}, p4, [x0, #-4, MUL VL]
+st2q {Z0.Q, Z1.Q}, p4, [x0, x2, lsl #4]
+st3q {Z0.Q, Z1.Q, Z2.Q}, p4, [x0, x4, lsl #4]
+st4q {Z0.Q, Z1.Q, Z2.Q, Z3.Q}, p4, [x0, x6, lsl #4]
diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
index de161db75d509b0ac96c604da7bc9743193d23b2..189bab5a92bcacb1ece30752817f666a34f5d81d 100644
--- a/include/opcode/aarch64.h
+++ b/include/opcode/aarch64.h
@@ -797,6 +797,9 @@ enum aarch64_opnd
AARCH64_OPND_MOPS_WB_Rn, /* Rn!, in bits [5, 9]. */
AARCH64_OPND_CSSC_SIMM8, /* CSSC signed 8-bit immediate. */
AARCH64_OPND_CSSC_UIMM8, /* CSSC unsigned 8-bit immediate. */
+ AARCH64_OPND_SME_Zt2, /* Double SVE vector register list. */
+ AARCH64_OPND_SME_Zt3, /* Triple SVE vector register list. */
+ AARCH64_OPND_SME_Zt4, /* Quad SVE vector register list. */
};
/* Qualifier constrains an operand. It either specifies a variant of an
diff --git a/opcodes/aarch64-dis.h b/opcodes/aarch64-dis.h
index 30212f2ae2c2759b5667e5a007912d22c4a702fc..48bebfea1e146e71d5fcae67c6558a35fe198e3f 100644
--- a/opcodes/aarch64-dis.h
+++ b/opcodes/aarch64-dis.h
@@ -139,6 +139,7 @@ AARCH64_DECL_OPD_EXTRACTOR (ext_imm_rotate2);
AARCH64_DECL_OPD_EXTRACTOR (ext_x0_to_x30);
AARCH64_DECL_OPD_EXTRACTOR (ext_simple_index);
AARCH64_DECL_OPD_EXTRACTOR (ext_plain_shrimm);
+AARCH64_DECL_OPD_EXTRACTOR (ext_sve_reglist_zt);
#undef AARCH64_DECL_OPD_EXTRACTOR
diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
index 1381e7524402a867cee23becbaa693d1b293c28d..9e96ba35ed45a404426467b897e379ba44e7e51a 100644
--- a/opcodes/aarch64-dis.c
+++ b/opcodes/aarch64-dis.c
@@ -2160,6 +2160,21 @@ aarch64_ext_sve_reglist (const aarch64_operand *self,
return true;
}
+/* Decode {Zn.<T> , Zm.<T>}. The fields array specifies which field
+ to use for Zn. The opcode-dependent value specifies the number
+ of registers in the list. */
+bool
+aarch64_ext_sve_reglist_zt (const aarch64_operand *self,
+ aarch64_opnd_info *info, aarch64_insn code,
+ const aarch64_inst *inst ATTRIBUTE_UNUSED,
+ aarch64_operand_error *errors ATTRIBUTE_UNUSED)
+{
+ info->reglist.first_regno = extract_field (self->fields[0], code, 0);
+ info->reglist.num_regs = get_operand_specific_data (self);
+ info->reglist.stride = 1;
+ return true;
+}
+
/* Decode a strided register list. The first field holds the top bit
(0 or 16) and the second field holds the lower bits. The stride is
16 divided by the list length. */
diff --git a/opcodes/aarch64-opc.c b/opcodes/aarch64-opc.c
index 1d8ed26c7090e4b73489b15e74a911e33b54555c..13cd2bcd8a7a79508c340bcf618af61b622bc0fe 100644
--- a/opcodes/aarch64-opc.c
+++ b/opcodes/aarch64-opc.c
@@ -1870,6 +1870,9 @@ operand_general_constraint_met_p (const aarch64_opnd_info *opnds, int idx,
case AARCH64_OPND_SME_Zmx4:
case AARCH64_OPND_SME_Znx2:
case AARCH64_OPND_SME_Znx4:
+ case AARCH64_OPND_SME_Zt2:
+ case AARCH64_OPND_SME_Zt3:
+ case AARCH64_OPND_SME_Zt4:
num = get_operand_specific_data (&aarch64_operands[type]);
if (!check_reglist (opnd, mismatch_detail, idx, num, 1))
return 0;
@@ -3626,7 +3629,10 @@ print_register_list (char *buf, size_t size, const aarch64_opnd_info *opnd,
/* The hyphenated form is preferred for disassembly if there are
more than two registers in the list, and the register numbers
are monotonically increasing in increments of one. */
- if (stride == 1 && num_regs > 1)
+ if (stride == 1 && num_regs > 1
+ && ((opnd->type != AARCH64_OPND_SME_Zt2)
+ && (opnd->type != AARCH64_OPND_SME_Zt3)
+ && (opnd->type != AARCH64_OPND_SME_Zt4)))
snprintf (buf, size, "{%s-%s}%s",
style_reg (styler, "%s%d.%s", prefix, first_reg, qlf_name),
style_reg (styler, "%s%d.%s", prefix, last_reg, qlf_name), tb);
@@ -4071,6 +4077,9 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
case AARCH64_OPND_SME_Znx4:
case AARCH64_OPND_SME_Ztx2_STRIDED:
case AARCH64_OPND_SME_Ztx4_STRIDED:
+ case AARCH64_OPND_SME_Zt2:
+ case AARCH64_OPND_SME_Zt3:
+ case AARCH64_OPND_SME_Zt4:
print_register_list (buf, size, opnd, "z", styler);
break;
diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
index 383091ef199310b21a0741527eca50bb4a10e668..c5c5c612e508b29ab99d60e0fae20d2c8fcccde4 100644
--- a/opcodes/aarch64-tbl.h
+++ b/opcodes/aarch64-tbl.h
@@ -1781,6 +1781,14 @@
{ \
QLF3(S_S,P_Z,S_S), \
}
+#define OP_SVE_SZS_QD \
+{ \
+ QLF3(S_Q,P_Z,S_D), \
+}
+#define OP_SVE_SUS_QD \
+{ \
+ QLF3(S_Q,NIL,S_D), \
+}
#define OP_SVE_SBB \
{ \
QLF3(S_S,S_B,S_B), \
@@ -6353,6 +6361,21 @@ const struct aarch64_opcode aarch64_opcode_table[] =
SVE2p1_INSN("dupq",0x05202400, 0xffe0fc00, sve_index1, 0, OP2 (SVE_Zd, SVE_Zn_5_INDEX), OP_SVE_VV_BHSD, 0, 0),
SVE2p1_INSN("extq",0x05602400, 0xfff0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zd, SVE_Zm_imm4), OP_SVE_BBB, 0, 0),
+ SVE2p1_INSNC("ld1q",0xc400a000, 0xffe0e000, sve_misc, 0, OP3 (SVE_Zt, SVE_Pg3, SVE_ADDR_ZX), OP_SVE_SZS_QD, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("ld2q",0xa490e000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("ld3q",0xa510e000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("ld4q",0xa590e000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("ld2q",0xa4a0e000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("ld3q",0xa5208000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("ld4q",0xa5a08000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QZU, 0, C_SCAN_MOVPRFX, 0),
+
+ SVE2p1_INSNC("st1q",0xe4202000, 0xffe0e000, sve_misc, 0, OP3 (SVE_Zt, SVE_Pg3, SVE_ADDR_ZX), OP_SVE_SUS_QD, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("st2q",0xe4400000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("st3q",0xe4800000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("st4q",0xe4c00000, 0xfff0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RI_S4x2xVL), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("st2q",0xe4600000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt2, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("st3q",0xe4a00000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt3, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
+ SVE2p1_INSNC("st4q",0xe4e00000, 0xffe0e000, sve_misc, 0, OP3 (SME_Zt4, SVE_Pg3, SVE_ADDR_RR_LSL4), OP_SVE_QUU, 0, C_SCAN_MOVPRFX, 0),
{0, 0, 0, 0, 0, 0, {}, {}, 0, 0, 0, NULL},
};
@@ -6989,4 +7012,13 @@ const struct aarch64_opcode aarch64_opcode_table[] =
Y(IMMEDIATE, imm, "CSSC_SIMM8", OPD_F_SEXT, F(FLD_CSSC_imm8), \
"an 8-bit signed immediate") \
Y(IMMEDIATE, imm, "CSSC_UIMM8", 0, F(FLD_CSSC_imm8), \
- "an 8-bit unsigned immediate")
+ "an 8-bit unsigned immediate") \
+ X(SVE_REGLIST, ins_sve_reglist, ext_sve_reglist_zt, "SME_Zt2", \
+ 2 << OPD_F_OD_LSB, F(FLD_SVE_Zt), \
+ "a list of 2 SVE vector registers") \
+ X(SVE_REGLIST, ins_sve_reglist, ext_sve_reglist_zt, "SME_Zt3", \
+ 3 << OPD_F_OD_LSB, F(FLD_SVE_Zt), \
+ "a list of 3 SVE vector registers") \
+ X(SVE_REGLIST, ins_sve_reglist, ext_sve_reglist_zt, "SME_Zt4", \
+ 4 << OPD_F_OD_LSB, F(FLD_SVE_Zt), \
+ "a list of 4 SVE vector registers")
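For readers following the table entries above, here is a minimal sketch (not the binutils implementation) of the two ideas this hunk relies on. An entry matches an instruction word when the bits selected by its mask equal its opcode value, and the SME_Zt2/SME_Zt3/SME_Zt4 operands carry their register count in a small bit-field of the operand flags. The opcode and mask constants are copied from the st2q/st4q entries in this patch. Every demo_ name, and the field position and width (DEMO_OD_LSB, DEMO_OD_MASK), is an assumption chosen for illustration; the real definitions live in opcodes/aarch64-opc.h.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the operand-dependent data field.
   The position and width are assumed here, not taken from binutils.  */
#define DEMO_OD_LSB  5u
#define DEMO_OD_MASK (0xfu << DEMO_OD_LSB)

/* Cut-down table entry: name, fixed opcode bits, and the mask that
   selects which bits of an instruction word are fixed.  */
struct demo_entry
{
  const char *name;
  uint32_t opcode;
  uint32_t mask;
};

/* Opcode/mask pairs copied from the patch.  */
static const struct demo_entry demo_table[] =
{
  {"st2q (scalar plus immediate)", 0xe4400000u, 0xfff0e000u},
  {"st2q (scalar plus scalar)",    0xe4600000u, 0xffe0e000u},
  {"st4q (scalar plus scalar)",    0xe4e00000u, 0xffe0e000u},
};

/* An instruction word matches an entry when its fixed bits agree.  */
static int
demo_matches (const struct demo_entry *e, uint32_t insn)
{
  return (insn & e->mask) == e->opcode;
}

/* Pack a register-list length into the flags word, mirroring the
   "2 << OPD_F_OD_LSB" style used for SME_Zt2 in the patch.  */
static uint32_t
demo_pack_count (unsigned int n)
{
  return (uint32_t) n << DEMO_OD_LSB;
}

/* Recover the register-list length from a flags word.  */
static unsigned int
demo_reg_count (uint32_t flags)
{
  return (flags & DEMO_OD_MASK) >> DEMO_OD_LSB;
}
```

With these definitions, demo_matches (&demo_table[0], 0xe4400ca7) is nonzero, while the same word tested against demo_table[1] gives zero because bit 21 differs. demo_reg_count (demo_pack_count (4)) returns 4, and other flag bits outside the assumed field do not disturb the count.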
* Re: [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions.
2024-01-15 9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
` (3 preceding siblings ...)
2024-01-15 9:40 ` [PATCH 6/6][Binutils] aarch64: Add SVE2.1 Contiguous load/store instructions Srinath Parvathaneni
@ 2024-01-15 11:46 ` Nick Clifton
4 siblings, 0 replies; 7+ messages in thread
From: Nick Clifton @ 2024-01-15 11:46 UTC (permalink / raw)
To: Srinath Parvathaneni, binutils; +Cc: richard.earnshaw
Hi Srinath,
> This patch add support for SVE2.1 and SME2.1 non-widening BFloat16
> (FEAT_B16B16) instructions.
>
> Following instructions predicated, unpredicated and indexed
> variants are added in this patch.
>
> bfadd, bfclamp, bfmax bfmaxnm, bfmin,bfminnm,
> bfmla,bfmls,bfmul and bfsub.
>
> Regression testing for aarch64-none-elf target and found no regressions.
>
> Ok for binutils-master?
Patch series approved and applied.
Cheers
Nick
Thread overview: 7+ messages
2024-01-15 9:28 [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Srinath Parvathaneni
2024-01-15 9:34 ` [PATCH 2/6][Binutils] aarch64: Add support for FEAT_SME2p1 instructions Srinath Parvathaneni
2024-01-15 9:35 ` [PATCH 3/6][Binutils] aarch64: Add support for FEAT_SVE2p1 Srinath Parvathaneni
2024-01-15 9:37 ` [PATCH 4/6][Binutils] aarch64: Add SVE2.1 dupq, eorqv and extq instructions Srinath Parvathaneni
2024-01-15 9:38 ` [PATCH 5/6][Binutils] aarch64: Add SVE2.1 fmin and fmax instructions Srinath Parvathaneni
2024-01-15 9:40 ` [PATCH 6/6][Binutils] aarch64: Add SVE2.1 Contiguous load/store instructions Srinath Parvathaneni
2024-01-15 11:46 ` [PATCH 1/6] [Binutils] aarch64: Add support for FEAT_B16B16 instructions Nick Clifton