Hi All,

This patch adds the expander support for autovectorization of complex number
operations such as complex addition with a rotation along the Argand plane.
It also adds support for complex FMA.  The instructions are described in the
ArmARM [1] and are available from Armv8.3-a onwards.

Concretely, for complex additions with a 90 degree rotation along the Argand
plane, this now generates:

f90:
	add	ip, r1, #15
	add	r3, r0, #15
	sub	r3, r3, r2
	sub	ip, ip, r2
	cmp	ip, #30
	cmphi	r3, #30
	add	r3, r0, #1600
	bls	.L5
.L3:
	vld1.32	{q8}, [r0]!
	vld1.32	{q9}, [r1]!
	vcadd.f32	q8, q8, q9, #90
	vst1.32	{q8}, [r2]!
	cmp	r0, r3
	bne	.L3
	bx	lr
.L5:
	vld1.32	{d16}, [r0]!
	vld1.32	{d17}, [r1]!
	vcadd.f32	d16, d16, d17, #90
	vst1.32	{d16}, [r2]!
	cmp	r0, r3
	bne	.L5
	bx	lr

instead of:

f90:
	add	ip, r1, #31
	add	r3, r0, #31
	sub	r3, r3, r2
	sub	ip, ip, r2
	cmp	ip, #62
	cmphi	r3, #62
	add	r3, r0, #1600
	bls	.L2
.L3:
	vld2.32	{d20-d23}, [r0]!
	vld2.32	{d24-d27}, [r1]!
	cmp	r0, r3
	vsub.f32	q8, q10, q13
	vadd.f32	q9, q12, q11
	vst2.32	{d16-d19}, [r2]!
	bne	.L3
	bx	lr
.L2:
	vldr	d19, .L10
.L5:
	vld1.32	{d16}, [r1]!
	vld1.32	{d18}, [r0]!
	vrev64.32	d16, d16
	cmp	r0, r3
	vsub.f32	d17, d18, d16
	vadd.f32	d16, d16, d18
	vswp	d16, d17
	vtbl.8	d16, {d16, d17}, d19
	vst1.32	{d16}, [r2]!
	bne	.L5
	bx	lr
.L11:
	.align	3
.L10:
	.byte	0
	.byte	1
	.byte	2
	.byte	3
	.byte	12
	.byte	13
	.byte	14
	.byte	15

(A sketch of the kind of source loop this corresponds to is included at the
end of this mail.)

[1] https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

Bootstrap and regtest on aarch64-none-linux-gnu, arm-none-gnueabihf and
x86_64-pc-linux-gnu are still ongoing, but the previous version of the patch
showed no regressions.

The instructions have also been tested on aarch64-none-elf and arm-none-eabi
on an Armv8.3-a model with -march=armv8.3-a+fp16, and all tests pass.

Ok for trunk?

Thanks,
Tamar

gcc/ChangeLog:

2018-11-11  Tamar Christina

	* config/arm/arm.c (arm_arch8_3, arm_arch8_4): New.
	* config/arm/arm.h (TARGET_COMPLEX, arm_arch8_3, arm_arch8_4): New.
	(arm_option_reconfigure_globals): Use them.
	* config/arm/iterators.md (VDF, VQ_HSF): New.
	(VCADD, VCMLA): New.
	(VF_constraint, rot, rotsplit1, rotsplit2): Add V4HF and V8HF.
	* config/arm/neon.md (neon_vcadd, fcadd3, neon_vcmla, fcmla4): New.
	* config/arm/unspecs.md (UNSPEC_VCADD90, UNSPEC_VCADD270,
	UNSPEC_VCMLA, UNSPEC_VCMLA90, UNSPEC_VCMLA180, UNSPEC_VCMLA270): New.

gcc/testsuite/ChangeLog:

2018-11-11  Tamar Christina

	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_1.c: Add Arm support.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_2.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_3.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_4.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_5.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_6.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_1.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_2.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_3.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_4.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_5.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_6.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_1.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_1.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_2.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_3.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_2.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_1.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_2.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_3.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_3.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_1.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_2.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_3.c: Likewise.

--
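
As promised above, here is a minimal sketch of the kind of source loop that
should exercise the new expanders.  It is an assumption reconstructed from the
generated code shown earlier (the 1600-byte loop bound corresponds to 200
complex floats), not the exact testcase from the patch, and the compile flags
mentioned in the comment are likewise a guess:

/* Hypothetical reproducer for the vcadd #90 sequence above.  Adding b
   multiplied by I rotates it by 90 degrees along the Argand plane, so
   c[i].re = a[i].re - b[i].im and c[i].im = a[i].im + b[i].re.  Compiled
   with something like -O3 -ffast-math -march=armv8.3-a+fp16, the complex
   lowering and the vectorizer should be able to use vcadd.f32.  */
#include <complex.h>

#define N 200

void
f90 (float complex a[restrict N], float complex b[restrict N],
     float complex c[restrict N])
{
  for (int i = 0; i < N; i++)
    c[i] = a[i] + b[i] * I;
}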