Because RVV has many more types than aarch64. The rvv-intrinsic doc shows how many RVV intrinsics there are:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/eopc/tuple-type-for-seg-load-store/auto-generated/intrinsic_funcs/02_vector_unit-stride_segment_load_store_instructions_zvlsseg.md

The number of RVV intrinsics explodes. For segment instructions, RVV has array types supporting NF from 2 to 8 for LMUL <= 1 (MF8, MF4, MF2, M1), whereas aarch64 only has array types with array sizes 2 to 4, and only for the equivalent of LMUL = 1 (a whole vector).

I think Kito can explain this issue more clearly.

juzhe.zhong@rivai.ai

From: Jeff Law
Date: 2023-04-10 22:54
To: juzhe.zhong; gcc-patches
CC: kito.cheng; palmer; jakub; richard.sandiford; rguenther
Subject: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit

On 4/10/23 08:48, juzhe.zhong@rivai.ai wrote:
> From: Juzhe-Zhong
>
> According to the RVV ISA:
> https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-type-register-vtype
> We have LMUL: 1/8, 1/4, 1/2, 1, 2, 4, 8
> Also, for segment instructions, we have tuple types for NF = 2 ~ 8.
> For example, for LMUL = 1/2, SEW = 32, we have vint32mf2_t,
> and we will have tuples for NF from 2 ~ 8: vint32mf2x2_t, vint32mf2x3_t ... vint32mf2x8_t.
> So we will end up with over 220 vector machine modes for RVV,
>
> PLUS the scalar machine modes that we already have in the RISC-V port.
>
> The total number of machine modes in the RISC-V port exceeds 256.
>
> Current GCC does not allow us to support the RVV segment instruction tuple types.
>
> So extend the machine mode size from 8 bits to 16 bits.
>
> I have another solution related to this patch.
> Maybe adding a target-dependent macro is better?
> Revise this patch like this:
>
> #ifdef TARGET_MAX_MACHINE_MODE_LARGER_THAN_256
> ENUM_BITFIELD(machine_mode) last_set_mode : 16;
> #else
> ENUM_BITFIELD(machine_mode) last_set_mode : 8;
> #endif
>
> Not sure whether this solution is better?
>
> This patch bootstrapped successfully on x86. Will run make-check gcc-testsuite tomorrow.
>
> Expecting this to land in GCC 14; any suggestions?
>
> gcc/ChangeLog:
>
> * combine.cc (struct reg_stat_type): Extend 8bit to 16bit.
> * cse.cc (struct qty_table_elem): Ditto.
> (struct table_elt): Ditto.
> (struct set): Ditto.
> * genopinit.cc (main): Ditto.
> * ira-int.h (struct ira_allocno): Ditto.
> * ree.cc (struct ATTRIBUTE_PACKED): Ditto.
> * rtl-ssa/accesses.h: Ditto.
> * rtl.h (struct GTY): Ditto.
> (subreg_shape::unique_id): Ditto.
> * rtlanal.h: Ditto.
> * tree-core.h (struct tree_type_common): Ditto.
> (struct tree_decl_common): Ditto.

This is likely going to be very controversial. It's going to increase the size of two of the most heavily used data structures in GCC (rtx and trees).

The first thing I would ask is whether or not we really need the full matrix in practice or if we can combine some of the modes.

Why hasn't aarch64 stumbled over this problem?

Jeff
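
A minimal sketch of the overflow under discussion, not taken from the patch or from GCC itself: once a port defines more than 256 machine modes, an 8-bit bit-field can no longer round-trip every mode number. The names `narrow_slot`, `wide_slot`, `store_narrow`, and `store_wide` are hypothetical, and plain `unsigned int` bit-fields stand in for `ENUM_BITFIELD(machine_mode)`.

```c
#include <assert.h>

/* Hypothetical stand-ins for a structure field that records a mode.
   In GCC the field type would be ENUM_BITFIELD(machine_mode); a plain
   unsigned bit-field is used here so the sketch is portable C.  */
struct narrow_slot { unsigned int mode : 8; };   /* holds 0..255   */
struct wide_slot   { unsigned int mode : 16; };  /* holds 0..65535 */

/* Store a mode number in an 8-bit field and read it back.
   Values >= 256 are reduced modulo 256 by the unsigned conversion.  */
unsigned int store_narrow(unsigned int mode)
{
    struct narrow_slot s;
    s.mode = mode;
    return s.mode;
}

/* Store a mode number in a 16-bit field and read it back.
   Any value below 65536 survives unchanged.  */
unsigned int store_wide(unsigned int mode)
{
    struct wide_slot s;
    assert(mode < 65536u);  /* 16 bits is the assumed upper bound */
    s.mode = mode;
    return s.mode;
}
```

Under these assumptions, `store_narrow(300)` returns 44 (300 modulo 256) while `store_wide(300)` returns 300, so a mode numbered 300 would be silently confused with mode 44 in any field left at 8 bits.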