Hi Richard, For the test-case mentioned in PR111702, compiling with -O2 -frounding-math -fstack-protector-all results in following ICE during cse2 pass: test.c: In function 'foo': test.c:119:1: internal compiler error: in insert_regs, at cse.cc:1120 119 | } | ^ 0xb7ebb0 insert_regs ../../gcc/gcc/cse.cc:1120 0x1f95134 merge_equiv_classes ../../gcc/gcc/cse.cc:1764 0x1f9b9ab cse_insn ../../gcc/gcc/cse.cc:4793 0x1f9fe30 cse_extended_basic_block ../../gcc/gcc/cse.cc:6577 0x1f9fe30 cse_main ../../gcc/gcc/cse.cc:6722 0x1fa0984 rest_of_handle_cse2 ../../gcc/gcc/cse.cc:7620 0x1fa0984 execute ../../gcc/gcc/cse.cc:7675 This happens only with interleave+zip1 vector initialization with -frounding-math -fstack-protector-all, while it compiles OK without -fstack-protector-all. Also, it compiles OK with fallback sequence code-gen (with or without -fstack-protector-all). Unfortunately, I haven't been able to reduce the test-case further :/ From the test-case, it seems only the vector initializer for type J uses interleave+zip1 approach, while rest of the vector initializers use fallback sequence. J is defined as: typedef _Float16 __attribute__((__vector_size__ (16))) J; and the initializer is: (J) { 11654, 4801, 5535, 9743, 61680} interleave+zip1 sequence for above initializer J: mode = V8HF vals: (parallel:V8HF [ (reg:HF 642) (reg:HF 645) (reg:HF 648) (reg:HF 651) (reg:HF 654) (const_double:HF 0.0 [0x0.0p+0]) repeated x3 ]) target: (reg:V8HF 641) seq: (insn 1058 0 1059 (set (reg:V4HF 657) (const_vector:V4HF [ (const_double:HF 0.0 [0x0.0p+0]) repeated x4 ])) "test.c":81:8 -1 (nil)) (insn 1059 1058 1060 (set (reg:V4HF 657) (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 642)) (reg:V4HF 657) (const_int 1 [0x1]))) "test.c":81:8 -1 (nil)) (insn 1060 1059 1061 (set (reg:V4HF 657) (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 648)) (reg:V4HF 657) (const_int 2 [0x2]))) "test.c":81:8 -1 (nil)) (insn 1061 1060 1062 (set (reg:V4HF 657) (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 654)) (reg:V4HF 657) (const_int 4 [0x4]))) "test.c":81:8 -1 (nil)) (insn 1062 1061 1063 (set (reg:V4HF 658) (const_vector:V4HF [ (const_double:HF 0.0 [0x0.0p+0]) repeated x4 ])) "test.c":81:8 -1 (nil)) (insn 1063 1062 1064 (set (reg:V4HF 658) (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 645)) (reg:V4HF 658) (const_int 1 [0x1]))) "test.c":81:8 -1 (nil)) (insn 1064 1063 1065 (set (reg:V4HF 658) (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 651)) (reg:V4HF 658) (const_int 2 [0x2]))) "test.c":81:8 -1 (nil)) (insn 1065 1064 0 (set (reg:V8HF 641) (unspec:V8HF [ (subreg:V8HF (reg:V4HF 657) 0) (subreg:V8HF (reg:V4HF 658) 0) ] UNSPEC_ZIP1)) "test.c":81:8 -1 (nil)) It seems to me that the above sequence correctly initializes the vector into r641 ? insns 1058-1061 construct r657 = { r642, r648, r654, 0 } insns 1062-1064 construct r658 = { r645, r651, 0, 0 } and zip1 will create r641 = { r642, r645, r648, r651, r654, 0, 0, 0 } For the above test, it seems that with interleave+zip1 approach and -fstack-protector-all, in cse pass, there are two separate equivalence classes created for (const_int 1), that need to be merged in cse_insn: if (elt->first_same_value != src_eqv_elt->first_same_value) { /* The REG_EQUAL is indicating that two formerly distinct classes are now equivalent. So merge them. */ merge_equiv_classes (elt, src_eqv_elt); elt equivalence chain: Equivalence chain for (subreg:QI (reg:V16QI 671) 0): (subreg:QI (reg:V16QI 671) 0) (const_int 1 [0x1]) src_eqv_elt equivalence chain: Equivalence chain for (const_int 1 [0x1]): (reg:QI 34 v2) (reg:QI 32 v0) (reg:QI 34 v2) (const_int 1 [0x1]) (vec_select:QI (reg:V16QI 671) (parallel [ (const_int 1 [0x1]) ])) (vec_select:QI (reg:V16QI 32 v0) (parallel [ (const_int 1 [0x1]) ])) (vec_select:QI (reg:V16QI 33 v1) (parallel [ (const_int 2 [0x2]) ])) (vec_select:QI (reg:V16QI 33 v1) (parallel [ (const_int 1 [0x1]) ])) The issue is that merge_equiv_classes doesn't seem to deal correctly with multiple occurences of same register in class2 (src_eqv_elt), which has two occurrences of (reg:QI 34 v2) In merge_equiv_classes, on first iteration, it will remove (reg:QI 34) from reg_equiv_table by calling delete_equiv_reg(34), and in insert_regs it will create an entry for (reg:QI 34) in qty_table with new quantity number, and create new equivalence in reg_eqv_table. When we again come across (reg:QI 34) in class2, it will unconditionally remove the register from reg_eqv_table, thus making REG_QTY(34) = -35, even tho (reg:QI 34) is now present in class1 chain. Then in insert_regs, we have: x: (reg:QI 34 v2) classp: (subreg:QI (reg:V16QI 671) 0) (reg:QI 34 v2) (const_int 1 [0x1]) And while iterating over elements in classp, we end up with regno == c_regno == 34. However, as mentioned above, merge_equiv_classes has deleted entry for (reg:QI 34) from reg_eqv_table, so it's no longer valid, and thus end up hitting the following assert: gcc_assert (REGNO_QTY_VALID_P (c_regno)); I am not sure tho why this is triggered only with interleave+zip1 approach with -fstack-protector-all. The attached (untested) patch is a workaround for the above issue -- In merge_equiv_classes, while iterating over elements in class2, it simply checks if element is a reg, and already inserted in class1 with equivalent mode, and avoids deleting it from reg_eqv_table in that case. This avoids hitting the assert, and following is the result of merging above two classes: Equivalence chain for (subreg:QI (reg:V16QI 671) 0): (subreg:QI (reg:V16QI 671) 0) (reg:QI 34 v2) (reg:QI 32 v0) (reg:QI 34 v2) (const_int 1 [0x1]) (const_int 1 [0x1]) (vec_select:QI (reg:V16QI 671) (parallel [ (const_int 1 [0x1]) ])) (vec_select:QI (reg:V16QI 33 v1) (parallel [ (const_int 1 [0x1]) ])) (vec_select:QI (reg:V16QI 33 v1) (parallel [ (const_int 2 [0x2]) ])) (vec_select:QI (reg:V16QI 32 v0) (parallel [ (const_int 1 [0x1]) ])) Which seems to be OK (?), but am not sure if this patch is in the right direction, and is also not efficient. Could you please suggest how to proceed ? Thanks, Prathamesh