Hi,

The previous version of this patch shared part of its code with the
store intrinsics patch
(https://gcc.gnu.org/ml/gcc-patches/2020-03/msg00145.html) so I removed
any duplicated code. This patch now depends on the previously mentioned
store intrinsics patch.

Here is the latest version and the updated ChangeLog.

gcc/ChangeLog:

2019-03-04  Delia Burduv

	* config/arm/arm_neon.h (bfloat16_t): New typedef.
	(vld2_bf16): New.
	(vld2q_bf16): New.
	(vld3_bf16): New.
	(vld3q_bf16): New.
	(vld4_bf16): New.
	(vld4q_bf16): New.
	(vld2_dup_bf16): New.
	(vld2q_dup_bf16): New.
	(vld3_dup_bf16): New.
	(vld3q_dup_bf16): New.
	(vld4_dup_bf16): New.
	(vld4q_dup_bf16): New.
	* config/arm/arm_neon_builtins.def
	(vld2): Changed to VAR13 and added v4bf, v8bf
	(vld2_dup): Changed to VAR8 and added v4bf, v8bf
	(vld3): Changed to VAR13 and added v4bf, v8bf
	(vld3_dup): Changed to VAR8 and added v4bf, v8bf
	(vld4): Changed to VAR13 and added v4bf, v8bf
	(vld4_dup): Changed to VAR8 and added v4bf, v8bf
	* config/arm/iterators.md (VDXBF): New iterator.
	(VQ2BF): New iterator.
	* config/arm/neon.md (vld2): Used new iterators.
	(vld2_dup): Used new iterators.
	(vld2_dupv8bf): New.
	(vst3): Used new iterators.
	(vst3qa): Used new iterators.
	(vst3qb): Used new iterators.
	(vld3_dup): Used new iterators.
	(vld3_dupv8bf): New.
	(vst4): Used new iterators.
	(vst4qa): Used new iterators.
	(vst4qb): Used new iterators.
	(vld4_dup): Used new iterators.
	(vld4_dupv8bf): New.

gcc/testsuite/ChangeLog:

2019-03-04  Delia Burduv

	* gcc.target/arm/simd/bf16_vldn_1.c: New test.

Thanks,
Delia

On 2/19/20 5:25 PM, Delia Burduv wrote:
>
> Hi,
>
> Here is the latest version of the patch. It just has some minor
> formatting changes that were brought up by Richard Sandiford in the
> AArch64 patches
>
> Thanks,
> Delia
>
> On 1/22/20 5:31 PM, Delia Burduv wrote:
>> Ping.
>>
>> I will change the tests to use the exact input and output registers as
>> Richard Sandiford suggested for the AArch64 patches.
>>
>> On 12/20/19 6:48 PM, Delia Burduv wrote:
>>> This patch adds the ARMv8.6 ACLE BFloat16 load intrinsics
>>> vld{q}_bf16 as part of the BFloat16 extension.
>>> (https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics)
>>>
>>> The intrinsics are declared in arm_neon.h.
>>> A new test is added to check assembler output.
>>>
>>> This patch depends on the Arm back-end patch.
>>> (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html)
>>>
>>> Tested for regression on arm-none-eabi and armeb-none-eabi. I don't
>>> have commit rights, so if this is ok can someone please commit it for
>>> me?
>>>
>>> gcc/ChangeLog:
>>>
>>> 2019-11-14  Delia Burduv
>>>
>>>     * config/arm/arm_neon.h (bfloat16_t): New typedef.
>>>     (bfloat16x4x2_t): New typedef.
>>>     (bfloat16x8x2_t): New typedef.
>>>     (bfloat16x4x3_t): New typedef.
>>>     (bfloat16x8x3_t): New typedef.
>>>     (bfloat16x4x4_t): New typedef.
>>>     (bfloat16x8x4_t): New typedef.
>>>     (vld2_bf16): New.
>>>     (vld2q_bf16): New.
>>>     (vld3_bf16): New.
>>>     (vld3q_bf16): New.
>>>     (vld4_bf16): New.
>>>     (vld4q_bf16): New.
>>>     (vld2_dup_bf16): New.
>>>     (vld2q_dup_bf16): New.
>>>     (vld3_dup_bf16): New.
>>>     (vld3q_dup_bf16): New.
>>>     (vld4_dup_bf16): New.
>>>     (vld4q_dup_bf16): New.
>>>     * config/arm/arm-builtins.c (E_V2BFmode): New mode.
>>>     (VAR13): New.
>>>     (arm_simd_types[Bfloat16x2_t]): New type.
>>>     * config/arm/arm-modes.def (V2BF): New mode.
>>>     * config/arm/arm-simd-builtin-types.def
>>>     (Bfloat16x2_t): New entry.
>>>     * config/arm/arm_neon_builtins.def
>>>     (vld2): Changed to VAR13 and added v4bf, v8bf
>>>     (vld2_dup): Changed to VAR8 and added v4bf, v8bf
>>>     (vld3): Changed to VAR13 and added v4bf, v8bf
>>>     (vld3_dup): Changed to VAR8 and added v4bf, v8bf
>>>     (vld4): Changed to VAR13 and added v4bf, v8bf
>>>     (vld4_dup): Changed to VAR8 and added v4bf, v8bf
>>>     * config/arm/iterators.md (VDXBF): New iterator.
>>>     (VQ2BF): New iterator.
>>>     (V_elem): Added V4BF, V8BF.
>>>     (V_sz_elem): Added V4BF, V8BF.
>>>     (V_mode_nunits): Added V4BF, V8BF.
>>>     (q): Added V4BF, V8BF.
>>>     * config/arm/neon.md (vld2): Used new iterators.
>>>     (vld2_dup): Used new iterators.
>>>     (vld2_dupv8bf): New.
>>>     (vst3): Used new iterators.
>>>     (vst3qa): Used new iterators.
>>>     (vst3qb): Used new iterators.
>>>     (vld3_dup): Used new iterators.
>>>     (vld3_dupv8bf): New.
>>>     (vst4): Used new iterators.
>>>     (vst4qa): Used new iterators.
>>>     (vst4qb): Used new iterators.
>>>     (vld4_dup): Used new iterators.
>>>     (vld4_dupv8bf): New.
>>>
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2019-11-14  Delia Burduv
>>>
>>>     * gcc.target/arm/simd/bf16_vldn_1.c: New test.
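
For readers less familiar with the vldN family, the lane layout that
vld2_bf16 and vld2_dup_bf16 are expected to produce can be modelled in
portable C. This is only a sketch and not code from the patch: the names
model_bf16x4x2, model_vld2 and model_vld2_dup are hypothetical, and
bfloat16 values are held as raw uint16_t storage on the assumption that
the loads move the bits without interpreting them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for bfloat16x4x2_t: two 4-lane vectors of
   raw 16-bit storage.  Not the type defined in arm_neon.h.  */
typedef struct
{
  uint16_t val[2][4];
} model_bf16x4x2;

/* Models the layout of a 2-element structure load (vld2): reads 8
   consecutive elements and de-interleaves them, so val[0] receives
   the even-indexed elements and val[1] the odd-indexed ones.  */
static model_bf16x4x2
model_vld2 (const uint16_t *ptr)
{
  model_bf16x4x2 r;
  for (size_t lane = 0; lane < 4; lane++)
    {
      r.val[0][lane] = ptr[2 * lane];
      r.val[1][lane] = ptr[2 * lane + 1];
    }
  return r;
}

/* Models the layout of the "dup" form (vld2_dup): reads a single
   2-element structure and replicates each element to every lane of
   its vector.  */
static model_bf16x4x2
model_vld2_dup (const uint16_t *ptr)
{
  model_bf16x4x2 r;
  for (size_t lane = 0; lane < 4; lane++)
    {
      r.val[0][lane] = ptr[0];
      r.val[1][lane] = ptr[1];
    }
  return r;
}
```

The vld3 and vld4 forms would follow the same pattern with a stride of 3
or 4 and a matching number of result vectors; the q variants use 8 lanes
per vector instead of 4.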