From: Xionghu Luo <luoxhu@linux.ibm.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: wschmidt@linux.ibm.com, gcc-patches@gcc.gnu.org,
linkw@gcc.gnu.org, dje.gcc@gmail.com
Subject: Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
Date: Wed, 15 Sep 2021 15:50:21 +0800 [thread overview]
Message-ID: <2d6953e9-ddbd-edff-dc71-2a2f7f13afc3@linux.ibm.com> (raw)
In-Reply-To: <8adb7d7f-44d6-7222-bb42-606092b140b0@linux.ibm.com>
Ping^3, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
On 2021/9/6 08:52, Xionghu Luo via Gcc-patches wrote:
> Ping^2, thanks.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
>
> On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote:
>> Gentle ping, thanks.
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
>>
>>
>> On 2021/5/14 14:57, Xionghu Luo via Gcc-patches wrote:
>>> Hi,
>>>
>>> On 2021/5/13 18:49, Segher Boessenkool wrote:
>>>> Hi!
>>>>
>>>> On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
>>>>> The vsel instruction is a bit-wise select instruction. Using an
>>>>> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
>>>>> being generated in the combine pass. Per element selection is a
>>>>> subset of per bit-wise selection,with the patch the pattern is
>>>>> written using bit operations. But there are 8 different patterns
>>>>> to define "op0 := (op1 & ~op3) | (op2 & op3)":
>>>>>
>>>>> (~op3&op1) | (op3&op2),
>>>>> (~op3&op1) | (op2&op3),
>>>>> (op3&op2) | (~op3&op1),
>>>>> (op2&op3) | (~op3&op1),
>>>>> (op1&~op3) | (op3&op2),
>>>>> (op1&~op3) | (op2&op3),
>>>>> (op3&op2) | (op1&~op3),
>>>>> (op2&op3) | (op1&~op3),
>>>>>
>>>>> Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
>>>>> canonical, which could reduce it to the FIRST 4 patterns, but it won't
>>>>> swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch
>>>>> handles it with two patterns with different NOT op3 position and check
>>>>> equality inside it.
>>>>
>>>> Yup, that latter case does not have canonicalisation rules. Btw, not
>>>> only combine does this canonicalisation: everything should,
>>>> non-canonical RTL is invalid RTL (in the instruction stream, you can do
>>>> everything in temporary code of course, as long as the RTL isn't
>>>> malformed).
>>>>
>>>>> -(define_insn "*altivec_vsel<mode>"
>>>>> +(define_insn "altivec_vsel<mode>"
>>>>> [(set (match_operand:VM 0 "altivec_register_operand" "=v")
>>>>> - (if_then_else:VM
>>>>> - (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
>>>>> - (match_operand:VM 4 "zero_constant" ""))
>>>>> - (match_operand:VM 2 "altivec_register_operand" "v")
>>>>> - (match_operand:VM 3 "altivec_register_operand" "v")))]
>>>>> - "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
>>>>> - "vsel %0,%3,%2,%1"
>>>>> + (ior:VM
>>>>> + (and:VM
>>>>> + (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
>>>>> + (match_operand:VM 1 "altivec_register_operand" "v"))
>>>>> + (and:VM
>>>>> + (match_operand:VM 2 "altivec_register_operand" "v")
>>>>> + (match_operand:VM 4 "altivec_register_operand" "v"))))]
>>>>> + "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
>>>>> + && (rtx_equal_p (operands[2], operands[3])
>>>>> + || rtx_equal_p (operands[4], operands[3]))"
>>>>> + {
>>>>> + if (rtx_equal_p (operands[2], operands[3]))
>>>>> + return "vsel %0,%1,%4,%3";
>>>>> + else
>>>>> + return "vsel %0,%1,%2,%3";
>>>>> + }
>>>>> [(set_attr "type" "vecmove")])
>>>>
>>>> That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
>>>> think. So please write this as two patterns (and keep the expand if
>>>> that helps).
>>>
>>> I was a bit concerned that there would be a lot of duplicate code if we
>>> write two patterns for each vsel, totally 4 similar patterns in
>>> altivec.md and another 4 in vsx.md make it difficult to maintain,
>>> however
>>> I updated it since you prefer this way, as you pointed out the xxsel in
>>> vsx.md could be folded by later patch.
>>>
>>>>
>>>>> +(define_insn "altivec_vsel<mode>2"
>>>>
>>>> (same here of course).
>>>>
>>>>> ;; Fused multiply add.
>>>>> diff --git a/gcc/config/rs6000/rs6000-call.c
>>>>> b/gcc/config/rs6000/rs6000-call.c
>>>>> index f5676255387..d65bdc01055 100644
>>>>> --- a/gcc/config/rs6000/rs6000-call.c
>>>>> +++ b/gcc/config/rs6000/rs6000-call.c
>>>>> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types
>>>>> altivec_overloaded_builtins[] = {
>>>>> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI,
>>>>> RS6000_BTI_unsigned_V2DI },
>>>>> { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>>> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI,
>>>>> RS6000_BTI_V2DI },
>>>>> - { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>>> + { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
>>>>
>>>> Are the _uns things still used for anything? But, let's not change
>>>> this until Bill's stuff is in :-)
>>>>
>>>> Why do you want to change this here, btw? I don't understand.
>>>
>>> OK, they are actually "unsigned type" overload builtin functions, change
>>> it or not so far won't cause functionality issue, I will revert this
>>> change
>>> in the updated patch.
>>>
>>>>
>>>>> + if (target == 0
>>>>> + || GET_MODE (target) != tmode
>>>>> + || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
>>>>
>>>> No space after ! and other unary operators (except for casts and other
>>>> operators you write with alphanumerics, like "sizeof"). I know you
>>>> copied this code, but :-)
>>>
>>> OK, thanks.
>>>
>>>>
>>>>> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx
>>>>> op_true, rtx op_false,
>>>>> case GEU:
>>>>> case LTU:
>>>>> case LEU:
>>>>> - /* Mark unsigned tests with CCUNSmode. */
>>>>> - cc_mode = CCUNSmode;
>>>>> /* Invert condition to avoid compound test if necessary. */
>>>>> if (rcode == GEU || rcode == LEU)
>>>>
>>>> So this is related to the _uns thing. Could you split off that change?
>>>> Probably as an earlier patch (but either works for me).
>>>
>>> Not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things, previously
>>> cc_mode
>>> is a parameter to generate the condition for IF_THEN_ELSE
>>> instruction, now
>>> we don't need it again as we use IOR (AND... AND...) style, remove it
>>> to avoid
>>> build error.
>>>
>>>
>>> - cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
>>> - CONST0_RTX (dest_mode));
>>> - emit_insn (gen_rtx_SET (dest,
>>> - gen_rtx_IF_THEN_ELSE (dest_mode,
>>> - cond2,
>>> - op_true,
>>> - op_false)));
>>> + rtx tmp = gen_rtx_IOR (dest_mode,
>>> + gen_rtx_AND (dest_mode, gen_rtx_NOT
>>> (dest_mode, mask),
>>> + op_false),
>>> + gen_rtx_AND (dest_mode, mask, op_true));
>>> + emit_insn (gen_rtx_SET (dest, tmp));
>>>
>>>
>>>>
>>>>> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx
>>>>> op_true, rtx op_false,
>>>>> if (!mask)
>>>>> return 0;
>>>>> + if (mask_mode != dest_mode)
>>>>> + mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
>>>>
>>>> Indent just two characters please: line continuations (usually) align,
>>>> but indents do not.>
>>>> Can you fold vsel and xxsel together completely? They have exactly the
>>>> same semantics! This does not have to be in this patch of course.
>>>
>>> I noticed that vperm/xxperm are folded together, do you mean fold
>>> vsel/xxsel
>>> like them? It's attached as:
>>> 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch
>>>
>>>
>>> Thanks,
>>> Xionghu
>>
>
--
Thanks,
Xionghu
next prev parent reply other threads:[~2021-09-15 7:50 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-30 6:32 Xionghu Luo
2021-05-13 1:18 ` *Ping*: " Xionghu Luo
2021-05-13 10:49 ` Segher Boessenkool
2021-05-14 6:57 ` Xionghu Luo
2021-06-07 2:15 ` Ping: " Xionghu Luo
2021-06-30 1:42 ` Xionghu Luo
2021-09-06 0:52 ` Ping ^ 2: " Xionghu Luo
2021-09-15 7:50 ` Xionghu Luo [this message]
2021-09-15 13:11 ` Ping ^ 3: " David Edelsohn
2021-09-17 5:43 ` Xionghu Luo
2021-09-15 14:14 ` Segher Boessenkool
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2d6953e9-ddbd-edff-dc71-2a2f7f13afc3@linux.ibm.com \
--to=luoxhu@linux.ibm.com \
--cc=dje.gcc@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=linkw@gcc.gnu.org \
--cc=segher@kernel.crashing.org \
--cc=wschmidt@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).