Re: [PATCH 1/7] arm: Auto-vectorization for MVE: vand

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
To: Christophe Lyon <christophe.lyon@linaro.org>
Cc: gcc Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH 1/7] arm: Auto-vectorization for MVE: vand
Date: Fri, 27 Nov 2020 14:13:49 +0000	[thread overview]
Message-ID: <d5b943f8-cf87-1dd6-5ba7-7e4ba3309a5b@arm.com> (raw)
In-Reply-To: <CAKdteOaNf1bEnPwdmyCwV4ehFJ=GzpKaM5=510KMUEyTZ6Devg@mail.gmail.com>

Hi Christophe,

On 26/11/2020 15:31, Christophe Lyon wrote:
> Hi Andre,
>
> Thanks for the quick feedback.
>
> On Wed, 25 Nov 2020 at 18:17, Andre Simoes Dias Vieira
> <andre.simoesdiasvieira@arm.com> wrote:
>> Hi Christophe,
>>
>> Thanks for these! See some inline comments.
>>
>> On 25/11/2020 13:54, Christophe Lyon via Gcc-patches wrote:
>>> This patch enables MVE vandq instructions for auto-vectorization.  MVE
>>> vandq insns in mve.md are modified to use 'and' instead of unspec
>>> expression to support and<mode>3.  The and<mode>3 expander is added to
>>> vec-common.md
>>>
>>> 2020-11-12  Christophe Lyon  <christophe.lyon@linaro.org>
>>>
>>>        gcc/
>>>        * gcc/config/arm/iterators.md (supf): Remove VANDQ_S and VANDQ_U.
>>>        (VANQ): Remove.
>>>        * config/arm/mve.md (mve_vandq_u<mode>): New entry for vand
>>>        instruction using expression and.
>>>        (mve_vandq_s<mode>): New expander.
>>>        * config/arm/neon.md (and<mode>3): Renamed into and<mode>3_neon.
>>>        * config/arm/unspecs.md (VANDQ_S, VANDQ_U): Remove.
>>>        * config/arm/vec-common.md (and<mode>3): New expander.
>>>
>>>        gcc/testsuite/
>>>        * gcc.target/arm/simd/mve-vand.c: New test.
>>> ---
>>>    gcc/config/arm/iterators.md                  |  4 +---
>>>    gcc/config/arm/mve.md                        | 20 ++++++++++++----
>>>    gcc/config/arm/neon.md                       |  2 +-
>>>    gcc/config/arm/unspecs.md                    |  2 --
>>>    gcc/config/arm/vec-common.md                 | 15 ++++++++++++
>>>    gcc/testsuite/gcc.target/arm/simd/mve-vand.c | 34 ++++++++++++++++++++++++++++
>>>    6 files changed, 66 insertions(+), 11 deletions(-)
>>>    create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vand.c
>>>
>>> diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
>>> index 592af35..72039e4 100644
>>> --- a/gcc/config/arm/iterators.md
>>> +++ b/gcc/config/arm/iterators.md
>>> @@ -1232,8 +1232,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
>>>                       (VADDLVQ_P_U "u") (VCMPNEQ_U "u") (VCMPNEQ_S "s")
>>>                       (VABDQ_M_S "s") (VABDQ_M_U "u") (VABDQ_S "s")
>>>                       (VABDQ_U "u") (VADDQ_N_S "s") (VADDQ_N_U "u")
>>> -                    (VADDVQ_P_S "s") (VADDVQ_P_U "u") (VANDQ_S "s")
>>> -                    (VANDQ_U "u") (VBICQ_S "s") (VBICQ_U "u")
>>> +                    (VADDVQ_P_S "s") (VADDVQ_P_U "u") (VBICQ_S "s") (VBICQ_U "u")
>>>                       (VBRSRQ_N_S "s") (VBRSRQ_N_U "u") (VCADDQ_ROT270_S "s")
>>>                       (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s")
>>>                       (VCMPEQQ_S "s") (VCMPEQQ_U "u") (VCADDQ_ROT90_U "u")
>>> @@ -1501,7 +1500,6 @@ (define_int_iterator VABDQ [VABDQ_S VABDQ_U])
>>>    (define_int_iterator VADDQ_N [VADDQ_N_S VADDQ_N_U])
>>>    (define_int_iterator VADDVAQ [VADDVAQ_S VADDVAQ_U])
>>>    (define_int_iterator VADDVQ_P [VADDVQ_P_U VADDVQ_P_S])
>>> -(define_int_iterator VANDQ [VANDQ_U VANDQ_S])
>>>    (define_int_iterator VBICQ [VBICQ_S VBICQ_U])
>>>    (define_int_iterator VBRSRQ_N [VBRSRQ_N_U VBRSRQ_N_S])
>>>    (define_int_iterator VCADDQ_ROT270 [VCADDQ_ROT270_S VCADDQ_ROT270_U])
>>> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
>>> index ecbaaa9..975eb7d 100644
>>> --- a/gcc/config/arm/mve.md
>>> +++ b/gcc/config/arm/mve.md
>>> @@ -894,17 +894,27 @@ (define_insn "mve_vaddvq_p_<supf><mode>"
>>>    ;;
>>>    ;; [vandq_u, vandq_s])
>>>    ;;
>>> -(define_insn "mve_vandq_<supf><mode>"
>>> +;; signed and unsigned versions are the same: define the unsigned
>>> +;; insn, and use an expander for the signed one as we still reference
>>> +;; both names from arm_mve.h.
>>> +(define_insn "mve_vandq_u<mode>"
>>>      [
>>>       (set (match_operand:MVE_2 0 "s_register_operand" "=w")
>>> -     (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
>>> -                    (match_operand:MVE_2 2 "s_register_operand" "w")]
>>> -      VANDQ))
>>> +     (and:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")
>>> +                    (match_operand:MVE_2 2 "s_register_operand" "w")))
>> The predicate on the second operand is more restrictive than the one in
>> expand 'neon_inv_logic_op2'. This should still work with immediates, or
>> well I checked for integers, it generates a loop as such:
>>
> Right, thanks for catching it.
>
>>           vldrw.32        q3, [r0]
>>           vldr.64 d4, .L8
>>           vldr.64 d5, .L8+8
>>           vand    q3, q3, q2
>>           vstrw.32        q3, [r2]
>>
>> MVE does support vand's with immediates, just like NEON, I suspect you
>> could just copy the way Neon handles these, possibly worth the little
>> extra effort. You can use dest[i] = a[i] & ~1 as a testcase.
>> If you don't it might still be worth expanding the test to make sure
>> other immediates-types combinations don't trigger an ICE?
>>
>> I'm not sure I understand why it loads it in two 64-bit chunks and not
>> do a single load or not just do something like a vmov or vbic immediate.
>> Anyhow that's a worry for another day I guess..
> Do you mean something like the attached (on top of this patch)?
> I dislike the code duplication in mve_vandq_u<mode> which would
> become a copy of and<mode>3_neon.
Hi Christophe,

Yeah that's what I meant. I agree with the code duplication. The reason 
we still use separate ones is because of the difference in supported 
modes. Maybe the right way around that would be to redefine VDQ as:

(define_mode_iterator VDQ [(V8QI "TARGET_HAVE_NEON") V16QI
                                                      (V4HI 
"TARGET_HAVE_NEON") V8HI
                                                      (V2SI 
"TARGET_HAVE_NEON") V4SI
                                                      (V4HF 
"TARGET_HAVE_NEON") V8HF
                                                      (V2SF 
"TARGET_HAVE_NEON") V4SF V2DI])

And have a single define_expand and insn for both vector extensions. 
Though we would also need to do something about the type attribut in 
case we want to have different scheduling types for both. Though right 
now we don't do any scheduling for MVE, I don't know whether these can 
be conditionalized on target features though, something to look at.

>
> The other concern is that it's not exercised by testcase: as you noted
> the compiler uses a pair of loads to prepare the second operand.
>
> But indeed that's probably a separate problem.
>
> I guess your comments apply to patch 2 (vorr)?

Yeah, forgot to reply to that one, but yes! I still need to have a look 
at 4-7.

Kind regards,
Andre

next prev parent reply	other threads:[~2020-11-27 14:13 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-25 13:54 Christophe Lyon
2020-11-25 13:54 ` [PATCH 2/7] arm: Auto-vectorization for MVE: vorr Christophe Lyon
2020-11-25 13:54 ` [PATCH 3/7] arm: Auto-vectorization for MVE: veor Christophe Lyon
2020-11-26 10:46   ` Andre Vieira (lists)
2020-11-25 13:54 ` [PATCH 4/7] arm: Auto-vectorization for MVE: vshl Christophe Lyon
2020-11-25 13:54 ` [PATCH 5/7] arm: Auto-vectorization for MVE: vshr Christophe Lyon
2020-11-25 13:54 ` [PATCH 6/7] arm: Auto-vectorization for MVE: vmvn Christophe Lyon
2020-11-25 17:17 ` [PATCH 1/7] arm: Auto-vectorization for MVE: vand Andre Simoes Dias Vieira
2020-11-26 15:31   ` Christophe Lyon
2020-11-27 14:13     ` Andre Vieira (lists) [this message]
2020-11-27 15:29       ` Christophe Lyon
2020-11-30 10:39         ` Christophe Lyon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d5b943f8-cf87-1dd6-5ba7-7e4ba3309a5b@arm.com \
    --to=andre.simoesdiasvieira@arm.com \
    --cc=christophe.lyon@linaro.org \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).