Hi, On 2021/5/13 18:49, Segher Boessenkool wrote: > Hi! > > On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote: >> The vsel instruction is a bit-wise select instruction. Using an >> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code >> being generated in the combine pass. Per element selection is a >> subset of per bit-wise selection,with the patch the pattern is >> written using bit operations. But there are 8 different patterns >> to define "op0 := (op1 & ~op3) | (op2 & op3)": >> >> (~op3&op1) | (op3&op2), >> (~op3&op1) | (op2&op3), >> (op3&op2) | (~op3&op1), >> (op2&op3) | (~op3&op1), >> (op1&~op3) | (op3&op2), >> (op1&~op3) | (op2&op3), >> (op3&op2) | (op1&~op3), >> (op2&op3) | (op1&~op3), >> >> Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative >> canonical, which could reduce it to the FIRST 4 patterns, but it won't >> swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch >> handles it with two patterns with different NOT op3 position and check >> equality inside it. > > Yup, that latter case does not have canonicalisation rules. Btw, not > only combine does this canonicalisation: everything should, > non-canonical RTL is invalid RTL (in the instruction stream, you can do > everything in temporary code of course, as long as the RTL isn't > malformed). > >> -(define_insn "*altivec_vsel" >> +(define_insn "altivec_vsel" >> [(set (match_operand:VM 0 "altivec_register_operand" "=v") >> - (if_then_else:VM >> - (ne:CC (match_operand:VM 1 "altivec_register_operand" "v") >> - (match_operand:VM 4 "zero_constant" "")) >> - (match_operand:VM 2 "altivec_register_operand" "v") >> - (match_operand:VM 3 "altivec_register_operand" "v")))] >> - "VECTOR_MEM_ALTIVEC_P (mode)" >> - "vsel %0,%3,%2,%1" >> + (ior:VM >> + (and:VM >> + (not:VM (match_operand:VM 3 "altivec_register_operand" "v")) >> + (match_operand:VM 1 "altivec_register_operand" "v")) >> + (and:VM >> + (match_operand:VM 2 "altivec_register_operand" "v") >> + (match_operand:VM 4 "altivec_register_operand" "v"))))] >> + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) >> + && (rtx_equal_p (operands[2], operands[3]) >> + || rtx_equal_p (operands[4], operands[3]))" >> + { >> + if (rtx_equal_p (operands[2], operands[3])) >> + return "vsel %0,%1,%4,%3"; >> + else >> + return "vsel %0,%1,%2,%3"; >> + } >> [(set_attr "type" "vecmove")]) > > That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I > think. So please write this as two patterns (and keep the expand if > that helps). I was a bit concerned that there would be a lot of duplicate code if we write two patterns for each vsel, totally 4 similar patterns in altivec.md and another 4 in vsx.md make it difficult to maintain, however I updated it since you prefer this way, as you pointed out the xxsel in vsx.md could be folded by later patch. > >> +(define_insn "altivec_vsel2" > > (same here of course). > >> ;; Fused multiply add. >> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c >> index f5676255387..d65bdc01055 100644 >> --- a/gcc/config/rs6000/rs6000-call.c >> +++ b/gcc/config/rs6000/rs6000-call.c >> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { >> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI }, >> { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI, >> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, >> - { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI, >> + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS, > > Are the _uns things still used for anything? But, let's not change > this until Bill's stuff is in :-) > > Why do you want to change this here, btw? I don't understand. OK, they are actually "unsigned type" overload builtin functions, change it or not so far won't cause functionality issue, I will revert this change in the updated patch. > >> + if (target == 0 >> + || GET_MODE (target) != tmode >> + || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) > > No space after ! and other unary operators (except for casts and other > operators you write with alphanumerics, like "sizeof"). I know you > copied this code, but :-) OK, thanks. > >> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false, >> case GEU: >> case LTU: >> case LEU: >> - /* Mark unsigned tests with CCUNSmode. */ >> - cc_mode = CCUNSmode; >> >> /* Invert condition to avoid compound test if necessary. */ >> if (rcode == GEU || rcode == LEU) > > So this is related to the _uns thing. Could you split off that change? > Probably as an earlier patch (but either works for me). Not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things, previously cc_mode is a parameter to generate the condition for IF_THEN_ELSE instruction, now we don't need it again as we use IOR (AND... AND...) style, remove it to avoid build error. - cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask), - CONST0_RTX (dest_mode)); - emit_insn (gen_rtx_SET (dest, - gen_rtx_IF_THEN_ELSE (dest_mode, - cond2, - op_true, - op_false))); + rtx tmp = gen_rtx_IOR (dest_mode, + gen_rtx_AND (dest_mode, gen_rtx_NOT (dest_mode, mask), + op_false), + gen_rtx_AND (dest_mode, mask, op_true)); + emit_insn (gen_rtx_SET (dest, tmp)); > >> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false, >> if (!mask) >> return 0; >> >> + if (mask_mode != dest_mode) >> + mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0); > > Indent just two characters please: line continuations (usually) align, > but indents do not.> > > Can you fold vsel and xxsel together completely? They have exactly the > same semantics! This does not have to be in this patch of course. I noticed that vperm/xxperm are folded together, do you mean fold vsel/xxsel like them? It's attached as: 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch Thanks, Xionghu