On 5/2/23 12:26, Kyrylo Tkachov wrote: > Hi Christophe, > >> -----Original Message----- >> From: Christophe Lyon >> Sent: Tuesday, April 18, 2023 2:46 PM >> To:gcc-patches@gcc.gnu.org; Kyrylo Tkachov; >> Richard Earnshaw; Richard Sandiford >> >> Cc: Christophe Lyon >> Subject: [PATCH 03/22] arm: [MVE intrinsics] Rework vreinterpretq >> >> This patch implements vreinterpretq using the new MVE intrinsics >> framework. >> >> The old definitions for vreinterpretq are removed as a consequence. >> >> 2022-09-08 Murray Steele >> Christophe Lyon >> >> gcc/ >> * config/arm/arm-mve-builtins-base.cc (vreinterpretq_impl): New >> class. >> * config/arm/arm-mve-builtins-base.def: Define vreinterpretq. >> * config/arm/arm-mve-builtins-base.h (vreinterpretq): New >> declaration. >> * config/arm/arm-mve-builtins-shapes.cc (parse_element_type): New >> function. >> (parse_type): Likewise. >> (parse_signature): Likewise. >> (build_one): Likewise. >> (build_all): Likewise. >> (overloaded_base): New struct. >> (unary_convert_def): Likewise. >> * config/arm/arm-mve-builtins-shapes.h (unary_convert): Declare. >> * config/arm/arm-mve-builtins.cc (TYPES_reinterpret_signed1): New >> macro. >> (TYPES_reinterpret_unsigned1): Likewise. >> (TYPES_reinterpret_integer): Likewise. >> (TYPES_reinterpret_integer1): Likewise. >> (TYPES_reinterpret_float1): Likewise. >> (TYPES_reinterpret_float): Likewise. >> (reinterpret_integer): New. >> (reinterpret_float): New. >> (handle_arm_mve_h): Register builtins. >> * config/arm/arm_mve.h (vreinterpretq_s16): Remove. >> (vreinterpretq_s32): Likewise. >> (vreinterpretq_s64): Likewise. >> (vreinterpretq_s8): Likewise. >> (vreinterpretq_u16): Likewise. >> (vreinterpretq_u32): Likewise. >> (vreinterpretq_u64): Likewise. >> (vreinterpretq_u8): Likewise. >> (vreinterpretq_f16): Likewise. >> (vreinterpretq_f32): Likewise. >> (vreinterpretq_s16_s32): Likewise. >> (vreinterpretq_s16_s64): Likewise. >> (vreinterpretq_s16_s8): Likewise. >> (vreinterpretq_s16_u16): Likewise. >> (vreinterpretq_s16_u32): Likewise. >> (vreinterpretq_s16_u64): Likewise. >> (vreinterpretq_s16_u8): Likewise. >> (vreinterpretq_s32_s16): Likewise. >> (vreinterpretq_s32_s64): Likewise. >> (vreinterpretq_s32_s8): Likewise. >> (vreinterpretq_s32_u16): Likewise. >> (vreinterpretq_s32_u32): Likewise. >> (vreinterpretq_s32_u64): Likewise. >> (vreinterpretq_s32_u8): Likewise. >> (vreinterpretq_s64_s16): Likewise. >> (vreinterpretq_s64_s32): Likewise. >> (vreinterpretq_s64_s8): Likewise. >> (vreinterpretq_s64_u16): Likewise. >> (vreinterpretq_s64_u32): Likewise. >> (vreinterpretq_s64_u64): Likewise. >> (vreinterpretq_s64_u8): Likewise. >> (vreinterpretq_s8_s16): Likewise. >> (vreinterpretq_s8_s32): Likewise. >> (vreinterpretq_s8_s64): Likewise. >> (vreinterpretq_s8_u16): Likewise. >> (vreinterpretq_s8_u32): Likewise. >> (vreinterpretq_s8_u64): Likewise. >> (vreinterpretq_s8_u8): Likewise. >> (vreinterpretq_u16_s16): Likewise. >> (vreinterpretq_u16_s32): Likewise. >> (vreinterpretq_u16_s64): Likewise. >> (vreinterpretq_u16_s8): Likewise. >> (vreinterpretq_u16_u32): Likewise. >> (vreinterpretq_u16_u64): Likewise. >> (vreinterpretq_u16_u8): Likewise. >> (vreinterpretq_u32_s16): Likewise. >> (vreinterpretq_u32_s32): Likewise. >> (vreinterpretq_u32_s64): Likewise. >> (vreinterpretq_u32_s8): Likewise. >> (vreinterpretq_u32_u16): Likewise. >> (vreinterpretq_u32_u64): Likewise. >> (vreinterpretq_u32_u8): Likewise. >> (vreinterpretq_u64_s16): Likewise. >> (vreinterpretq_u64_s32): Likewise. >> (vreinterpretq_u64_s64): Likewise. >> (vreinterpretq_u64_s8): Likewise. >> (vreinterpretq_u64_u16): Likewise. >> (vreinterpretq_u64_u32): Likewise. >> (vreinterpretq_u64_u8): Likewise. >> (vreinterpretq_u8_s16): Likewise. >> (vreinterpretq_u8_s32): Likewise. >> (vreinterpretq_u8_s64): Likewise. >> (vreinterpretq_u8_s8): Likewise. >> (vreinterpretq_u8_u16): Likewise. >> (vreinterpretq_u8_u32): Likewise. >> (vreinterpretq_u8_u64): Likewise. >> (vreinterpretq_s32_f16): Likewise. >> (vreinterpretq_s32_f32): Likewise. >> (vreinterpretq_u16_f16): Likewise. >> (vreinterpretq_u16_f32): Likewise. >> (vreinterpretq_u32_f16): Likewise. >> (vreinterpretq_u32_f32): Likewise. >> (vreinterpretq_u64_f16): Likewise. >> (vreinterpretq_u64_f32): Likewise. >> (vreinterpretq_u8_f16): Likewise. >> (vreinterpretq_u8_f32): Likewise. >> (vreinterpretq_f16_f32): Likewise. >> (vreinterpretq_f16_s16): Likewise. >> (vreinterpretq_f16_s32): Likewise. >> (vreinterpretq_f16_s64): Likewise. >> (vreinterpretq_f16_s8): Likewise. >> (vreinterpretq_f16_u16): Likewise. >> (vreinterpretq_f16_u32): Likewise. >> (vreinterpretq_f16_u64): Likewise. >> (vreinterpretq_f16_u8): Likewise. >> (vreinterpretq_f32_f16): Likewise. >> (vreinterpretq_f32_s16): Likewise. >> (vreinterpretq_f32_s32): Likewise. >> (vreinterpretq_f32_s64): Likewise. >> (vreinterpretq_f32_s8): Likewise. >> (vreinterpretq_f32_u16): Likewise. >> (vreinterpretq_f32_u32): Likewise. >> (vreinterpretq_f32_u64): Likewise. >> (vreinterpretq_f32_u8): Likewise. >> (vreinterpretq_s16_f16): Likewise. >> (vreinterpretq_s16_f32): Likewise. >> (vreinterpretq_s64_f16): Likewise. >> (vreinterpretq_s64_f32): Likewise. >> (vreinterpretq_s8_f16): Likewise. >> (vreinterpretq_s8_f32): Likewise. >> (__arm_vreinterpretq_f16): Likewise. >> (__arm_vreinterpretq_f32): Likewise. >> (__arm_vreinterpretq_s16): Likewise. >> (__arm_vreinterpretq_s32): Likewise. >> (__arm_vreinterpretq_s64): Likewise. >> (__arm_vreinterpretq_s8): Likewise. >> (__arm_vreinterpretq_u16): Likewise. >> (__arm_vreinterpretq_u32): Likewise. >> (__arm_vreinterpretq_u64): Likewise. >> (__arm_vreinterpretq_u8): Likewise. >> * config/arm/arm_mve_types.h (__arm_vreinterpretq_s16_s32): >> Remove. >> (__arm_vreinterpretq_s16_s64): Likewise. >> (__arm_vreinterpretq_s16_s8): Likewise. >> (__arm_vreinterpretq_s16_u16): Likewise. >> (__arm_vreinterpretq_s16_u32): Likewise. >> (__arm_vreinterpretq_s16_u64): Likewise. >> (__arm_vreinterpretq_s16_u8): Likewise. >> (__arm_vreinterpretq_s32_s16): Likewise. >> (__arm_vreinterpretq_s32_s64): Likewise. >> (__arm_vreinterpretq_s32_s8): Likewise. >> (__arm_vreinterpretq_s32_u16): Likewise. >> (__arm_vreinterpretq_s32_u32): Likewise. >> (__arm_vreinterpretq_s32_u64): Likewise. >> (__arm_vreinterpretq_s32_u8): Likewise. >> (__arm_vreinterpretq_s64_s16): Likewise. >> (__arm_vreinterpretq_s64_s32): Likewise. >> (__arm_vreinterpretq_s64_s8): Likewise. >> (__arm_vreinterpretq_s64_u16): Likewise. >> (__arm_vreinterpretq_s64_u32): Likewise. >> (__arm_vreinterpretq_s64_u64): Likewise. >> (__arm_vreinterpretq_s64_u8): Likewise. >> (__arm_vreinterpretq_s8_s16): Likewise. >> (__arm_vreinterpretq_s8_s32): Likewise. >> (__arm_vreinterpretq_s8_s64): Likewise. >> (__arm_vreinterpretq_s8_u16): Likewise. >> (__arm_vreinterpretq_s8_u32): Likewise. >> (__arm_vreinterpretq_s8_u64): Likewise. >> (__arm_vreinterpretq_s8_u8): Likewise. >> (__arm_vreinterpretq_u16_s16): Likewise. >> (__arm_vreinterpretq_u16_s32): Likewise. >> (__arm_vreinterpretq_u16_s64): Likewise. >> (__arm_vreinterpretq_u16_s8): Likewise. >> (__arm_vreinterpretq_u16_u32): Likewise. >> (__arm_vreinterpretq_u16_u64): Likewise. >> (__arm_vreinterpretq_u16_u8): Likewise. >> (__arm_vreinterpretq_u32_s16): Likewise. >> (__arm_vreinterpretq_u32_s32): Likewise. >> (__arm_vreinterpretq_u32_s64): Likewise. >> (__arm_vreinterpretq_u32_s8): Likewise. >> (__arm_vreinterpretq_u32_u16): Likewise. >> (__arm_vreinterpretq_u32_u64): Likewise. >> (__arm_vreinterpretq_u32_u8): Likewise. >> (__arm_vreinterpretq_u64_s16): Likewise. >> (__arm_vreinterpretq_u64_s32): Likewise. >> (__arm_vreinterpretq_u64_s64): Likewise. >> (__arm_vreinterpretq_u64_s8): Likewise. >> (__arm_vreinterpretq_u64_u16): Likewise. >> (__arm_vreinterpretq_u64_u32): Likewise. >> (__arm_vreinterpretq_u64_u8): Likewise. >> (__arm_vreinterpretq_u8_s16): Likewise. >> (__arm_vreinterpretq_u8_s32): Likewise. >> (__arm_vreinterpretq_u8_s64): Likewise. >> (__arm_vreinterpretq_u8_s8): Likewise. >> (__arm_vreinterpretq_u8_u16): Likewise. >> (__arm_vreinterpretq_u8_u32): Likewise. >> (__arm_vreinterpretq_u8_u64): Likewise. >> (__arm_vreinterpretq_s32_f16): Likewise. >> (__arm_vreinterpretq_s32_f32): Likewise. >> (__arm_vreinterpretq_s16_f16): Likewise. >> (__arm_vreinterpretq_s16_f32): Likewise. >> (__arm_vreinterpretq_s64_f16): Likewise. >> (__arm_vreinterpretq_s64_f32): Likewise. >> (__arm_vreinterpretq_s8_f16): Likewise. >> (__arm_vreinterpretq_s8_f32): Likewise. >> (__arm_vreinterpretq_u16_f16): Likewise. >> (__arm_vreinterpretq_u16_f32): Likewise. >> (__arm_vreinterpretq_u32_f16): Likewise. >> (__arm_vreinterpretq_u32_f32): Likewise. >> (__arm_vreinterpretq_u64_f16): Likewise. >> (__arm_vreinterpretq_u64_f32): Likewise. >> (__arm_vreinterpretq_u8_f16): Likewise. >> (__arm_vreinterpretq_u8_f32): Likewise. >> (__arm_vreinterpretq_f16_f32): Likewise. >> (__arm_vreinterpretq_f16_s16): Likewise. >> (__arm_vreinterpretq_f16_s32): Likewise. >> (__arm_vreinterpretq_f16_s64): Likewise. >> (__arm_vreinterpretq_f16_s8): Likewise. >> (__arm_vreinterpretq_f16_u16): Likewise. >> (__arm_vreinterpretq_f16_u32): Likewise. >> (__arm_vreinterpretq_f16_u64): Likewise. >> (__arm_vreinterpretq_f16_u8): Likewise. >> (__arm_vreinterpretq_f32_f16): Likewise. >> (__arm_vreinterpretq_f32_s16): Likewise. >> (__arm_vreinterpretq_f32_s32): Likewise. >> (__arm_vreinterpretq_f32_s64): Likewise. >> (__arm_vreinterpretq_f32_s8): Likewise. >> (__arm_vreinterpretq_f32_u16): Likewise. >> (__arm_vreinterpretq_f32_u32): Likewise. >> (__arm_vreinterpretq_f32_u64): Likewise. >> (__arm_vreinterpretq_f32_u8): Likewise. >> (__arm_vreinterpretq_s16): Likewise. >> (__arm_vreinterpretq_s32): Likewise. >> (__arm_vreinterpretq_s64): Likewise. >> (__arm_vreinterpretq_s8): Likewise. >> (__arm_vreinterpretq_u16): Likewise. >> (__arm_vreinterpretq_u32): Likewise. >> (__arm_vreinterpretq_u64): Likewise. >> (__arm_vreinterpretq_u8): Likewise. >> (__arm_vreinterpretq_f16): Likewise. >> (__arm_vreinterpretq_f32): Likewise. >> * config/arm/mve.md (@arm_mve_reinterpret): New >> pattern. >> * config/arm/unspecs.md: (REINTERPRET): New unspec. >> >> gcc/testsuite/ >> * g++.target/arm/mve.exp: Add general-c++ and general directories. >> * g++.target/arm/mve/general-c++/nomve_fp_1.c: New test. >> * g++.target/arm/mve/general-c++/vreinterpretq_1.C: New test. >> * gcc.target/arm/mve/general-c/nomve_fp_1.c: New test. >> * gcc.target/arm/mve/general-c/vreinterpretq_1.c: New test. >> --- >> gcc/config/arm/arm-mve-builtins-base.cc | 29 + >> gcc/config/arm/arm-mve-builtins-base.def | 2 + >> gcc/config/arm/arm-mve-builtins-base.h | 2 + >> gcc/config/arm/arm-mve-builtins-shapes.cc | 28 + >> gcc/config/arm/arm-mve-builtins-shapes.h | 8 + >> gcc/config/arm/arm-mve-builtins.cc | 60 + >> gcc/config/arm/arm_mve.h | 300 ---- >> gcc/config/arm/arm_mve_types.h | 1365 +---------------- >> gcc/config/arm/mve.md | 18 + >> gcc/config/arm/unspecs.md | 1 + >> gcc/testsuite/g++.target/arm/mve.exp | 8 +- >> .../arm/mve/general-c++/nomve_fp_1.c | 15 + >> .../arm/mve/general-c++/vreinterpretq_1.C | 25 + >> .../gcc.target/arm/mve/general-c/nomve_fp_1.c | 15 + >> .../arm/mve/general-c/vreinterpretq_1.c | 25 + >> 15 files changed, 286 insertions(+), 1615 deletions(-) >> create mode 100644 gcc/testsuite/g++.target/arm/mve/general- >> c++/nomve_fp_1.c >> create mode 100644 gcc/testsuite/g++.target/arm/mve/general- >> c++/vreinterpretq_1.C >> create mode 100644 gcc/testsuite/gcc.target/arm/mve/general- >> c/nomve_fp_1.c >> create mode 100644 gcc/testsuite/gcc.target/arm/mve/general- >> c/vreinterpretq_1.c >> >> diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm- >> mve-builtins-base.cc >> index e9f285faf2b..ad8d500afc6 100644 >> --- a/gcc/config/arm/arm-mve-builtins-base.cc >> +++ b/gcc/config/arm/arm-mve-builtins-base.cc >> @@ -38,8 +38,37 @@ using namespace arm_mve; >> >> namespace { >> >> +/* Implements vreinterpretq_* intrinsics. */ >> +class vreinterpretq_impl : public quiet >> +{ >> + gimple * >> + fold (gimple_folder &f) const override >> + { >> + /* Punt to rtl if the effect of the reinterpret on registers does not >> + conform to GCC's endianness model. */ >> + if (!targetm.can_change_mode_class (f.vector_mode (0), >> + f.vector_mode (1), VFP_REGS)) >> + return NULL; >> + > So we punt to an RTL pattern here if we cannot change mode class... > > [snip] > >> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md >> index 35eab6c94bf..ab688396f97 100644 >> --- a/gcc/config/arm/mve.md >> +++ b/gcc/config/arm/mve.md >> @@ -10561,3 +10561,21 @@ (define_expand >> "vcond_mask_" >> } >> DONE; >> }) >> + >> +;; Reinterpret operand 1 in operand 0's mode, without changing its contents. >> +(define_expand "@arm_mve_reinterpret" >> + [(set (match_operand:MVE_vecs 0 "register_operand") >> + (unspec:MVE_vecs >> + [(match_operand 1 "arm_any_register_operand")] >> + REINTERPRET))] >> + "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (mode)) >> + || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE >> (mode))" >> + { >> + machine_mode src_mode = GET_MODE (operands[1]); >> + if (targetm.can_change_mode_class (mode, src_mode, >> VFP_REGS)) >> + { >> + emit_move_insn (operands[0], gen_lowpart (mode, >> operands[1])); >> + DONE; >> + } >> + } >> +) > ... But we still check can_change_mode_class in this pattern and if it's not true we emit the new REINTERPRET unspec > without a corresponding define_insn pattern. Won't that ICE? Would this case occur on big-endian targets? Looks like you are right. However, arm_mve.h is protected by: #if __ARM_BIG_ENDIAN #error "MVE intrinsics are not supported in Big-Endian mode." Just tried to hack my arm_mve.h to accept big-endian, and indeed we do ICE. In fact, this pattern and vreinterpretq_impl above are quite similar to the aarch64 implementation. I tried with a sample svint16_t foo(svint8_t value1) { returnsvreinterpret_s16_s8(value1); } and it seems aarch64-none-elf-gcc -march=armv8.2-a+sve -mbig-endian is OK, although aarch64_can_change_mode_class() has: if (BYTES_BIG_ENDIAN) ... if (from_sve_p && GET_MODE_UNIT_SIZE (from) != GET_MODE_UNIT_SIZE (to)) return false; so it should have a similar problem? I', not sure why it doesn't ICE? Thanks, Christophe > > Thanks, > Kyrill > >> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md >> index 84384ee798d..dccda283573 100644 >> --- a/gcc/config/arm/unspecs.md >> +++ b/gcc/config/arm/unspecs.md >> @@ -1255,4 +1255,5 @@ (define_c_enum "unspec" [ >> SQRSHRL_64 >> SQRSHRL_48 >> VSHLCQ_M_ >> + REINTERPRET >> ])