Sorry for the late reply. I just got back from a week of vacation.
I was planning to finish this patch after the vacation, but it seems that you have almost finished it.
That's great! Thank you so much.

juzhe.zhong@rivai.ai

From: Richard Biener
Date: 2022-10-07 20:24
To: juzhe.zhong
CC: gcc-patches; richard.sandiford
Subject: Re: [PATCH] Add first-order recurrence autovectorization

On Thu, Oct 6, 2022 at 3:07 PM Richard Biener wrote:
>
> On Thu, Oct 6, 2022 at 2:13 PM Richard Biener wrote:
> >
> > On Fri, Sep 30, 2022 at 10:00 AM wrote:
> > >
> > > From: Ju-Zhe Zhong
> > >
> > > Hi, after fixing the previous ICE,
> > > I added the full implementation (inserting a permutation to get the correct result).
> > >
> > > The gimple IR is correct now, I think:
> > >   # t_21 = PHI <_4(6), t_12(9)>
> > >   # i_22 = PHI
> > >   # vectp_a.6_26 = PHI
> > >   # vect_vec_recur_.9_9 = PHI
> > >   # vectp_b.11_7 = PHI
> > >   # curr_cnt_36 = PHI
> > >   # loop_len_20 = PHI
> > >   _38 = .WHILE_LEN (loop_len_20, 32, POLY_INT_CST [4, 4]);
> > >   while_len_37 = _38;
> > >   _1 = (long unsigned int) i_22;
> > >   _2 = _1 * 4;
> > >   _3 = a_14(D) + _2;
> > >   vect__4.8_19 = .LEN_LOAD (vectp_a.6_26, 32B, loop_len_20, 0);
> > >   _4 = *_3;
> > >   _5 = b_15(D) + _2;
> > >   vect_vec_recur_.9_9 = VEC_PERM_EXPR ;
> > >
> > > But I encounter another ICE:
> > > 0x169e0e7 process_bb
> > >         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:7498
> > > 0x16a09af do_rpo_vn(function*, edge_def*, bitmap_head*, bool, bool, vn_lookup_kind)
> > >         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:8109
> > > 0x16a0fe7 do_rpo_vn(function*, edge_def*, bitmap_head*)
> > >         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:8205
> > > 0x179b7db execute
> > >         ../../../riscv-gcc/gcc/tree-vectorizer.cc:1365
> > >
> > > Could you help me with this? After fixing this ICE, I think the loop vectorizer
> > > can run correctly. Maybe you can test it on x86 or ARM after fixing this ICE.
> >
> > Sorry for the late reply, the issue is that we have
> >
> >   vect_vec_recur_.7_7 = VEC_PERM_EXPR > { 7, 8, 9, 10, 11, 12, 13, 14 }>;
> >
> > thus
> >
> > +      for (unsigned i = 0; i < ncopies; ++i)
> > +       {
> > +         gphi *phi = as_a <gphi *> (STMT_VINFO_VEC_STMTS (def_stmt_info)[i]);
> > +         tree latch = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop));
> > +         tree recur = gimple_phi_result (phi);
> > +         gassign *assign
> > +           = gimple_build_assign (recur, VEC_PERM_EXPR, recur, latch, perm);
> > +         gimple_assign_set_lhs (assign, recur);
> >
> > needs to create a new SSA name for each LHS. You shouldn't create code in
> > vect_get_vec_defs_for_operand either.
> >
> > Let me mangle the patch a bit.
> >
> > The attached is what I came up with; the permutes need to be generated when
> > the backedge PHI values are filled in. Still missing is ncopies > 1 handling
> > (we'd need to think about how the initial value and the permutes would work
> > there), as is SLP support, but more importantly handling in the epilogue
> > (so on x86 this requires a constant loop bound).
> >
> > I've added a testcase that triggers on x86_64.
>
> Actually I broke it, the following is more correct.

So let me finish the patch. I have everything besides the epilogue
handling done; I'll get to that sometime next week.

Richard.

> Richard.
>
> > Richard.
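
Editor's note: for readers following the thread, the selector { 7, 8, 9, 10, 11, 12, 13, 14 } quoted above can be read as "take the last lane of the previous vector, then the first seven lanes of the current one". The sketch below models that in plain C. It is not the patch's code. The loop body (b[i] = a[i] + t), the function names, and the fixed VF of 8 are all assumptions made for illustration, and vec_perm is only a scalar model of a two-input permute. As in the thread, there is no epilogue, so n has to be a multiple of VF.

```c
#include <assert.h>
#include <string.h>

#define VF 8 /* assumed vectorization factor, matching the 8-lane selector */

/* Scalar first-order recurrence: each iteration uses the value of t
   set by the previous iteration.  The loop body is hypothetical.  */
static void scalar_recur (const int *a, int *b, int n, int t)
{
  for (int i = 0; i < n; ++i)
    {
      b[i] = a[i] + t;
      t = a[i];
    }
}

/* Scalar model of a two-input permute: selector values 0..VF-1 pick
   from v0, values VF..2*VF-1 pick from v1.  */
static void vec_perm (const int *v0, const int *v1, const int *sel, int *out)
{
  for (int i = 0; i < VF; ++i)
    out[i] = sel[i] < VF ? v0[sel[i]] : v1[sel[i] - VF];
}

/* Vector-style version of the same loop, processing VF elements at a time.  */
static void vector_recur (const int *a, int *b, int n, int t)
{
  static const int sel[VF] = { 7, 8, 9, 10, 11, 12, 13, 14 };
  int recur[VF] = { 0 };

  assert (n % VF == 0);  /* no epilogue handling in this sketch */
  recur[VF - 1] = t;     /* initial value lives in the last lane */

  for (int i = 0; i < n; i += VF)
    {
      int cur[VF], shifted[VF];
      memcpy (cur, a + i, sizeof cur);

      /* The permute result goes into its own temporary, loosely
         analogous to giving the permute a fresh SSA name instead of
         reusing the PHI result.  */
      vec_perm (recur, cur, sel, shifted);

      for (int l = 0; l < VF; ++l)
        b[i + l] = cur[l] + shifted[l];

      /* The current vector becomes the recurrence value on the backedge.  */
      memcpy (recur, cur, sizeof cur);
    }
}
```

Calling scalar_recur and vector_recur on the same input with n a multiple of 8 should fill b identically, which is a quick way to convince yourself that the selector is the right one for this kind of loop.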