Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Xionghu Luo <luoxhu@linux.ibm.com>
To: David Edelsohn <dje.gcc@gmail.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>,
	Bill Schmidt <wschmidt@linux.ibm.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	linkw@gcc.gnu.org
Subject: Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
Date: Fri, 17 Sep 2021 13:43:56 +0800	[thread overview]
Message-ID: <6ceb35dd-71ff-c8bc-eb46-fbd20f9a1663@linux.ibm.com> (raw)
In-Reply-To: <CAGWvny=Gkc_XAsQ-YJNFrUTJbHJ_gRZ4-9b2mJ6p1OaNhtdKLw@mail.gmail.com>



On 2021/9/15 21:11, David Edelsohn wrote:
> Hi, Xionhu
> 
> Should "altivec_vsel<mode>2" .. 3 .. 4 be "*altivec_vsel<mode>2", etc.
> because they are combiner patterns and never referenced by name?  Only
> the first, named pattern is referenced by the builtin code.

Thanks, updated the patchset with Segher's review comments, he didn't mention
about this and sorry to forget change this part,  I am also not
sure whether "altivec_vsel<mode>2" .. 3 .. 4 will be used/generated or
optimized by expander in future, is there any benefit to add "*" to the
define_insn patterns?

> 
> Other than that question / suggestion, this patch is okay.  Please
> coordinate with Bill and his builtin patches.

OK.

> 
> Thanks, David
> 
> On Wed, Sep 15, 2021 at 3:50 AM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>>
>> Ping^3, thanks.
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
>>
>>
>> On 2021/9/6 08:52, Xionghu Luo via Gcc-patches wrote:
>>> Ping^2, thanks.
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
>>>
>>> On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote:
>>>> Gentle ping, thanks.
>>>>
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
>>>>
>>>>
>>>> On 2021/5/14 14:57, Xionghu Luo via Gcc-patches wrote:
>>>>> Hi,
>>>>>
>>>>> On 2021/5/13 18:49, Segher Boessenkool wrote:
>>>>>> Hi!
>>>>>>
>>>>>> On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
>>>>>>> The vsel instruction is a bit-wise select instruction.  Using an
>>>>>>> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
>>>>>>> being generated in the combine pass.  Per element selection is a
>>>>>>> subset of per bit-wise selection,with the patch the pattern is
>>>>>>> written using bit operations.  But there are 8 different patterns
>>>>>>> to define "op0 := (op1 & ~op3) | (op2 & op3)":
>>>>>>>
>>>>>>> (~op3&op1) | (op3&op2),
>>>>>>> (~op3&op1) | (op2&op3),
>>>>>>> (op3&op2) | (~op3&op1),
>>>>>>> (op2&op3) | (~op3&op1),
>>>>>>> (op1&~op3) | (op3&op2),
>>>>>>> (op1&~op3) | (op2&op3),
>>>>>>> (op3&op2) | (op1&~op3),
>>>>>>> (op2&op3) | (op1&~op3),
>>>>>>>
>>>>>>> Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
>>>>>>> canonical, which could reduce it to the FIRST 4 patterns, but it won't
>>>>>>> swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch
>>>>>>> handles it with two patterns with different NOT op3 position and check
>>>>>>> equality inside it.
>>>>>>
>>>>>> Yup, that latter case does not have canonicalisation rules.  Btw, not
>>>>>> only combine does this canonicalisation: everything should,
>>>>>> non-canonical RTL is invalid RTL (in the instruction stream, you can do
>>>>>> everything in temporary code of course, as long as the RTL isn't
>>>>>> malformed).
>>>>>>
>>>>>>> -(define_insn "*altivec_vsel<mode>"
>>>>>>> +(define_insn "altivec_vsel<mode>"
>>>>>>>      [(set (match_operand:VM 0 "altivec_register_operand" "=v")
>>>>>>> -    (if_then_else:VM
>>>>>>> -     (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
>>>>>>> -        (match_operand:VM 4 "zero_constant" ""))
>>>>>>> -     (match_operand:VM 2 "altivec_register_operand" "v")
>>>>>>> -     (match_operand:VM 3 "altivec_register_operand" "v")))]
>>>>>>> -  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
>>>>>>> -  "vsel %0,%3,%2,%1"
>>>>>>> +    (ior:VM
>>>>>>> +     (and:VM
>>>>>>> +      (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
>>>>>>> +      (match_operand:VM 1 "altivec_register_operand" "v"))
>>>>>>> +     (and:VM
>>>>>>> +      (match_operand:VM 2 "altivec_register_operand" "v")
>>>>>>> +      (match_operand:VM 4 "altivec_register_operand" "v"))))]
>>>>>>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
>>>>>>> +  && (rtx_equal_p (operands[2], operands[3])
>>>>>>> +  || rtx_equal_p (operands[4], operands[3]))"
>>>>>>> +  {
>>>>>>> +    if (rtx_equal_p (operands[2], operands[3]))
>>>>>>> +      return "vsel %0,%1,%4,%3";
>>>>>>> +    else
>>>>>>> +      return "vsel %0,%1,%2,%3";
>>>>>>> +  }
>>>>>>>      [(set_attr "type" "vecmove")])
>>>>>>
>>>>>> That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
>>>>>> think.  So please write this as two patterns (and keep the expand if
>>>>>> that helps).
>>>>>
>>>>> I was a bit concerned that there would be a lot of duplicate code if we
>>>>> write two patterns for each vsel, totally 4 similar patterns in
>>>>> altivec.md and another 4 in vsx.md make it difficult to maintain,
>>>>> however
>>>>> I updated it since you prefer this way, as you pointed out the xxsel in
>>>>> vsx.md could be folded by later patch.
>>>>>
>>>>>>
>>>>>>> +(define_insn "altivec_vsel<mode>2"
>>>>>>
>>>>>> (same here of course).
>>>>>>
>>>>>>>    ;; Fused multiply add.
>>>>>>> diff --git a/gcc/config/rs6000/rs6000-call.c
>>>>>>> b/gcc/config/rs6000/rs6000-call.c
>>>>>>> index f5676255387..d65bdc01055 100644
>>>>>>> --- a/gcc/config/rs6000/rs6000-call.c
>>>>>>> +++ b/gcc/config/rs6000/rs6000-call.c
>>>>>>> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types
>>>>>>> altivec_overloaded_builtins[] = {
>>>>>>>        RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI,
>>>>>>> RS6000_BTI_unsigned_V2DI },
>>>>>>>      { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>>>>>        RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI,
>>>>>>> RS6000_BTI_V2DI },
>>>>>>> -  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>>>>> +  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
>>>>>>
>>>>>> Are the _uns things still used for anything?  But, let's not change
>>>>>> this until Bill's stuff is in :-)
>>>>>>
>>>>>> Why do you want to change this here, btw?  I don't understand.
>>>>>
>>>>> OK, they are actually "unsigned type" overload builtin functions, change
>>>>> it or not so far won't cause functionality issue, I will revert this
>>>>> change
>>>>> in the updated patch.
>>>>>
>>>>>>
>>>>>>> +  if (target == 0
>>>>>>> +      || GET_MODE (target) != tmode
>>>>>>> +      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
>>>>>>
>>>>>> No space after ! and other unary operators (except for casts and other
>>>>>> operators you write with alphanumerics, like "sizeof").  I know you
>>>>>> copied this code, but :-)
>>>>>
>>>>> OK, thanks.
>>>>>
>>>>>>
>>>>>>> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx
>>>>>>> op_true, rtx op_false,
>>>>>>>        case GEU:
>>>>>>>        case LTU:
>>>>>>>        case LEU:
>>>>>>> -      /* Mark unsigned tests with CCUNSmode.  */
>>>>>>> -      cc_mode = CCUNSmode;
>>>>>>>          /* Invert condition to avoid compound test if necessary.  */
>>>>>>>          if (rcode == GEU || rcode == LEU)
>>>>>>
>>>>>> So this is related to the _uns thing.  Could you split off that change?
>>>>>> Probably as an earlier patch (but either works for me).
>>>>>
>>>>> Not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things, previously
>>>>> cc_mode
>>>>> is a parameter to generate the condition for IF_THEN_ELSE
>>>>> instruction, now
>>>>> we don't need it again as we use IOR (AND... AND...) style, remove it
>>>>> to avoid
>>>>> build error.
>>>>>
>>>>>
>>>>> -  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
>>>>> -                         CONST0_RTX (dest_mode));
>>>>> -  emit_insn (gen_rtx_SET (dest,
>>>>> -                         gen_rtx_IF_THEN_ELSE (dest_mode,
>>>>> -                                               cond2,
>>>>> -                                               op_true,
>>>>> -                                               op_false)));
>>>>> +  rtx tmp = gen_rtx_IOR (dest_mode,
>>>>> +                        gen_rtx_AND (dest_mode, gen_rtx_NOT
>>>>> (dest_mode, mask),
>>>>> +                                     op_false),
>>>>> +                        gen_rtx_AND (dest_mode, mask, op_true));
>>>>> +  emit_insn (gen_rtx_SET (dest, tmp));
>>>>>
>>>>>
>>>>>>
>>>>>>> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx
>>>>>>> op_true, rtx op_false,
>>>>>>>      if (!mask)
>>>>>>>        return 0;
>>>>>>> +  if (mask_mode != dest_mode)
>>>>>>> +      mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
>>>>>>
>>>>>> Indent just two characters please: line continuations (usually) align,
>>>>>> but indents do not.>
>>>>>> Can you fold vsel and xxsel together completely?  They have exactly the
>>>>>> same semantics!  This does not have to be in this patch of course.
>>>>>
>>>>> I noticed that vperm/xxperm are folded together, do you mean fold
>>>>> vsel/xxsel
>>>>> like them?  It's attached as:
>>>>> 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Xionghu
>>>>
>>>
>>
>> --
>> Thanks,
>> Xionghu

-- 
Thanks,
Xionghu

next prev parent reply	other threads:[~2021-09-17  5:44 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-30  6:32 Xionghu Luo
2021-05-13  1:18 ` *Ping*: " Xionghu Luo
2021-05-13 10:49 ` Segher Boessenkool
2021-05-14  6:57   ` Xionghu Luo
2021-06-07  2:15     ` Ping: " Xionghu Luo
2021-06-30  1:42     ` Xionghu Luo
2021-09-06  0:52       ` Ping ^ 2: " Xionghu Luo
2021-09-15  7:50         ` Ping ^ 3: " Xionghu Luo
2021-09-15 13:11           ` David Edelsohn
2021-09-17  5:43             ` Xionghu Luo [this message]
2021-09-15 14:14     ` Segher Boessenkool

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6ceb35dd-71ff-c8bc-eb46-fbd20f9a1663@linux.ibm.com \
    --to=luoxhu@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=linkw@gcc.gnu.org \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).