Yeah. Thanks. I have sent V14: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619478.html which I found there is no distinction between SLP and non-SLP. Could you review it? I think it's more reasonable now. Thanks. juzhe.zhong@rivai.ai From: Richard Sandiford Date: 2023-05-24 22:57 To: 钟居哲 CC: gcc-patches; rguenther Subject: Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support 钟居哲 writes: >>> Both approaches are fine. I'm not against one or the other. > >>> What I didn't understand was why your patch only reuses existing IVs >>> for max_nscalars_per_iter == 1. Was it to avoid having to do a >>> multiplication (well, really a shift left) when moving from one >>> rgroup to another? E.g. if one rgroup had; > >>> nscalars_per_iter == 2 && factor == 1 > >>> and another had: > >>> nscalars_per_iter == 4 && factor == 1 > >>> then we would need to mulitply by 2 when going from the first rgroup >>> to the second. > >>> If so, avoiding a multiplication seems like a good reason for the choice >>> you were making in the path. But we then need to check >>> max_nscalars_per_iter == 1 for both the source rgroup and the >>> destination rgroup, not just the destination. And I think the >>> condition for “no multiplication needed” should be that: > > Oh, I didn't realize such complicated problem. Frankly, I didn't understand well > rgroup. Sorry about that :). > > I just remember last time you said I need to handle multiple-rgroup > not only for SLP but also non-SLP (which is vec_pack_trunk that I tested). > Then I asked you when is non-SLP, you said max_nscalars_per_iter == 1. Yeah, max_nscalars_per_iter == 1 is the right way of checking for non-SLP. But I'm never been convinced that SLP vs. non-SLP is a meaningful distinction for this patch (that is, the parts that don't use SELECT_VL). SLP vs. non-SLP matters for SELECT_VL. But the rgroup abstraction should mean that SLP vs. non-SLP doesn't matter otherwise. Thanks, Richard