From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ua1-x930.google.com (mail-ua1-x930.google.com [IPv6:2607:f8b0:4864:20::930]) by sourceware.org (Postfix) with ESMTPS id AEF0C3858403; Wed, 15 Sep 2021 13:12:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AEF0C3858403 Received: by mail-ua1-x930.google.com with SMTP id c33so1643143uae.9; Wed, 15 Sep 2021 06:12:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=0a1VE9Yv/hWwnF0Mdz64PeU88y65lwfhIuGO6Ttaw5g=; b=Z+YLfEahSZ/UqgGwCyeIuLYDl1yCkjNTg8L9HvcPMCeZjlPMYj1p26BrC7HiQQS/Xx F2xYCmkFx4aAiZC2wEKh/KHTqmkfB9BBAb/fSt8XYdt8NVEu/MXkOdJXXuNiaU0V4FaC sGSLz6nSPqDqu2B7BmUFE2C2xC2Nik7kXX7a6fPTUqCXm7Ec+Ze8w+RytyzUrW9pgW3e 4nIAUyPKMUQfn75qMRDAMW177TA6NSq2Uj3GzrhnYcfYJ2obtzBoCnriHyb3Z6oO5YWQ oL/5ORrHExfV4QNCddw8dhxjD0WFqOFgtGBgc/urrRVb5AsR3ZLXFu/r/9Lhiedlor9P Azfw== X-Gm-Message-State: AOAM531x2QHuE66r7h/aMxKk3B+783DVua5Opu5UXNpocspjd0Mv4Wlu GbHinc4RX6b253Bfu96apjeX15/AeQaQb+Tuxts= X-Google-Smtp-Source: ABdhPJxOJSAZea/KYdP41GG8YVGK6vlLiJOeiAviRY2uWHBcfL+quRcGi1B2dpewEAYyhOpMoTkY/3rEXFBv0+z95qE= X-Received: by 2002:ab0:4303:: with SMTP id k3mr7898460uak.141.1631711535010; Wed, 15 Sep 2021 06:12:15 -0700 (PDT) MIME-Version: 1.0 References: <20210430063258.2774866-1-luoxhu@linux.ibm.com> <20210513104931.GG10366@gate.crashing.org> <973f14d9-c626-89b4-753f-e5c613eef6f2@linux.ibm.com> <2b08faad-887f-73c4-2b84-75dd0feb51c7@linux.ibm.com> <8adb7d7f-44d6-7222-bb42-606092b140b0@linux.ibm.com> <2d6953e9-ddbd-edff-dc71-2a2f7f13afc3@linux.ibm.com> In-Reply-To: <2d6953e9-ddbd-edff-dc71-2a2f7f13afc3@linux.ibm.com> From: David Edelsohn Date: Wed, 15 Sep 2021 09:11:42 -0400 Message-ID: Subject: Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613] To: Xionghu Luo Cc: Segher Boessenkool , Bill Schmidt , GCC Patches , linkw@gcc.gnu.org Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Sep 2021 13:12:17 -0000 Hi, Xionhu Should "altivec_vsel2" .. 3 .. 4 be "*altivec_vsel2", etc. because they are combiner patterns and never referenced by name? Only the first, named pattern is referenced by the builtin code. Other than that question / suggestion, this patch is okay. Please coordinate with Bill and his builtin patches. Thanks, David On Wed, Sep 15, 2021 at 3:50 AM Xionghu Luo wrote: > > Ping^3, thanks. > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html > > > On 2021/9/6 08:52, Xionghu Luo via Gcc-patches wrote: > > Ping^2, thanks. > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html > > > > On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote: > >> Gentle ping, thanks. > >> > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html > >> > >> > >> On 2021/5/14 14:57, Xionghu Luo via Gcc-patches wrote: > >>> Hi, > >>> > >>> On 2021/5/13 18:49, Segher Boessenkool wrote: > >>>> Hi! > >>>> > >>>> On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote: > >>>>> The vsel instruction is a bit-wise select instruction. Using an > >>>>> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code > >>>>> being generated in the combine pass. Per element selection is a > >>>>> subset of per bit-wise selection,with the patch the pattern is > >>>>> written using bit operations. But there are 8 different patterns > >>>>> to define "op0 := (op1 & ~op3) | (op2 & op3)": > >>>>> > >>>>> (~op3&op1) | (op3&op2), > >>>>> (~op3&op1) | (op2&op3), > >>>>> (op3&op2) | (~op3&op1), > >>>>> (op2&op3) | (~op3&op1), > >>>>> (op1&~op3) | (op3&op2), > >>>>> (op1&~op3) | (op2&op3), > >>>>> (op3&op2) | (op1&~op3), > >>>>> (op2&op3) | (op1&~op3), > >>>>> > >>>>> Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative > >>>>> canonical, which could reduce it to the FIRST 4 patterns, but it won't > >>>>> swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch > >>>>> handles it with two patterns with different NOT op3 position and check > >>>>> equality inside it. > >>>> > >>>> Yup, that latter case does not have canonicalisation rules. Btw, not > >>>> only combine does this canonicalisation: everything should, > >>>> non-canonical RTL is invalid RTL (in the instruction stream, you can do > >>>> everything in temporary code of course, as long as the RTL isn't > >>>> malformed). > >>>> > >>>>> -(define_insn "*altivec_vsel" > >>>>> +(define_insn "altivec_vsel" > >>>>> [(set (match_operand:VM 0 "altivec_register_operand" "=v") > >>>>> - (if_then_else:VM > >>>>> - (ne:CC (match_operand:VM 1 "altivec_register_operand" "v") > >>>>> - (match_operand:VM 4 "zero_constant" "")) > >>>>> - (match_operand:VM 2 "altivec_register_operand" "v") > >>>>> - (match_operand:VM 3 "altivec_register_operand" "v")))] > >>>>> - "VECTOR_MEM_ALTIVEC_P (mode)" > >>>>> - "vsel %0,%3,%2,%1" > >>>>> + (ior:VM > >>>>> + (and:VM > >>>>> + (not:VM (match_operand:VM 3 "altivec_register_operand" "v")) > >>>>> + (match_operand:VM 1 "altivec_register_operand" "v")) > >>>>> + (and:VM > >>>>> + (match_operand:VM 2 "altivec_register_operand" "v") > >>>>> + (match_operand:VM 4 "altivec_register_operand" "v"))))] > >>>>> + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) > >>>>> + && (rtx_equal_p (operands[2], operands[3]) > >>>>> + || rtx_equal_p (operands[4], operands[3]))" > >>>>> + { > >>>>> + if (rtx_equal_p (operands[2], operands[3])) > >>>>> + return "vsel %0,%1,%4,%3"; > >>>>> + else > >>>>> + return "vsel %0,%1,%2,%3"; > >>>>> + } > >>>>> [(set_attr "type" "vecmove")]) > >>>> > >>>> That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I > >>>> think. So please write this as two patterns (and keep the expand if > >>>> that helps). > >>> > >>> I was a bit concerned that there would be a lot of duplicate code if we > >>> write two patterns for each vsel, totally 4 similar patterns in > >>> altivec.md and another 4 in vsx.md make it difficult to maintain, > >>> however > >>> I updated it since you prefer this way, as you pointed out the xxsel in > >>> vsx.md could be folded by later patch. > >>> > >>>> > >>>>> +(define_insn "altivec_vsel2" > >>>> > >>>> (same here of course). > >>>> > >>>>> ;; Fused multiply add. > >>>>> diff --git a/gcc/config/rs6000/rs6000-call.c > >>>>> b/gcc/config/rs6000/rs6000-call.c > >>>>> index f5676255387..d65bdc01055 100644 > >>>>> --- a/gcc/config/rs6000/rs6000-call.c > >>>>> +++ b/gcc/config/rs6000/rs6000-call.c > >>>>> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types > >>>>> altivec_overloaded_builtins[] = { > >>>>> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, > >>>>> RS6000_BTI_unsigned_V2DI }, > >>>>> { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI, > >>>>> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, > >>>>> RS6000_BTI_V2DI }, > >>>>> - { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI, > >>>>> + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS, > >>>> > >>>> Are the _uns things still used for anything? But, let's not change > >>>> this until Bill's stuff is in :-) > >>>> > >>>> Why do you want to change this here, btw? I don't understand. > >>> > >>> OK, they are actually "unsigned type" overload builtin functions, change > >>> it or not so far won't cause functionality issue, I will revert this > >>> change > >>> in the updated patch. > >>> > >>>> > >>>>> + if (target == 0 > >>>>> + || GET_MODE (target) != tmode > >>>>> + || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) > >>>> > >>>> No space after ! and other unary operators (except for casts and other > >>>> operators you write with alphanumerics, like "sizeof"). I know you > >>>> copied this code, but :-) > >>> > >>> OK, thanks. > >>> > >>>> > >>>>> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx > >>>>> op_true, rtx op_false, > >>>>> case GEU: > >>>>> case LTU: > >>>>> case LEU: > >>>>> - /* Mark unsigned tests with CCUNSmode. */ > >>>>> - cc_mode = CCUNSmode; > >>>>> /* Invert condition to avoid compound test if necessary. */ > >>>>> if (rcode == GEU || rcode == LEU) > >>>> > >>>> So this is related to the _uns thing. Could you split off that change? > >>>> Probably as an earlier patch (but either works for me). > >>> > >>> Not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things, previously > >>> cc_mode > >>> is a parameter to generate the condition for IF_THEN_ELSE > >>> instruction, now > >>> we don't need it again as we use IOR (AND... AND...) style, remove it > >>> to avoid > >>> build error. > >>> > >>> > >>> - cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask), > >>> - CONST0_RTX (dest_mode)); > >>> - emit_insn (gen_rtx_SET (dest, > >>> - gen_rtx_IF_THEN_ELSE (dest_mode, > >>> - cond2, > >>> - op_true, > >>> - op_false))); > >>> + rtx tmp = gen_rtx_IOR (dest_mode, > >>> + gen_rtx_AND (dest_mode, gen_rtx_NOT > >>> (dest_mode, mask), > >>> + op_false), > >>> + gen_rtx_AND (dest_mode, mask, op_true)); > >>> + emit_insn (gen_rtx_SET (dest, tmp)); > >>> > >>> > >>>> > >>>>> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx > >>>>> op_true, rtx op_false, > >>>>> if (!mask) > >>>>> return 0; > >>>>> + if (mask_mode != dest_mode) > >>>>> + mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0); > >>>> > >>>> Indent just two characters please: line continuations (usually) align, > >>>> but indents do not.> > >>>> Can you fold vsel and xxsel together completely? They have exactly the > >>>> same semantics! This does not have to be in this patch of course. > >>> > >>> I noticed that vperm/xxperm are folded together, do you mean fold > >>> vsel/xxsel > >>> like them? It's attached as: > >>> 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch > >>> > >>> > >>> Thanks, > >>> Xionghu > >> > > > > -- > Thanks, > Xionghu