From: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
To: Christophe Lyon <christophe.lyon.oss@gmail.com>
Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>,
gcc Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [ARM] PR66791: Replace builtins for vdup_n and vmov_n intrinsics
Date: Fri, 3 Sep 2021 14:05:48 +0530 [thread overview]
Message-ID: <CAAgBjMkBru_aMTwZv_Be89x6NV0JM0+tAYz-uD9pgwOXwF+g1w@mail.gmail.com> (raw)
In-Reply-To: <CAKhMtS+XaHAWJm8HUAbdndTaqFky5uROj4G4Y+gnPxxGQewMXA@mail.gmail.com>
On Thu, 2 Sept 2021 at 14:32, Christophe Lyon
<christophe.lyon.oss@gmail.com> wrote:
>
>
>
> On Tue, Aug 24, 2021 at 10:17 AM Kyrylo Tkachov <Kyrylo.Tkachov@arm.com> wrote:
>>
>>
>>
>> > -----Original Message-----
>> > From: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
>> > Sent: 24 August 2021 09:01
>> > To: Christophe Lyon <christophe.lyon.oss@gmail.com>
>> > Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; gcc Patches <gcc-
>> > patches@gcc.gnu.org>
>> > Subject: Re: [ARM] PR66791: Replace builtins for vdup_n and vmov_n
>> > intrinsics
>> >
>> > On Tue, 17 Aug 2021 at 11:55, Prathamesh Kulkarni
>> > <prathamesh.kulkarni@linaro.org> wrote:
>> > >
>> > > On Thu, 12 Aug 2021 at 19:04, Christophe Lyon
>> > > <christophe.lyon.oss@gmail.com> wrote:
>> > > >
>> > > >
>> > > >
>> > > > On Thu, Aug 12, 2021 at 1:54 PM Prathamesh Kulkarni
>> > <prathamesh.kulkarni@linaro.org> wrote:
>> > > >>
>> > > >> On Wed, 11 Aug 2021 at 22:23, Christophe Lyon
>> > > >> <christophe.lyon.oss@gmail.com> wrote:
>> > > >> >
>> > > >> >
>> > > >> >
>> > > >> > On Thu, Jun 24, 2021 at 6:29 PM Kyrylo Tkachov via Gcc-patches <gcc-
>> > patches@gcc.gnu.org> wrote:
>> > > >> >>
>> > > >> >>
>> > > >> >>
>> > > >> >> > -----Original Message-----
>> > > >> >> > From: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
>> > > >> >> > Sent: 24 June 2021 12:11
>> > > >> >> > To: gcc Patches <gcc-patches@gcc.gnu.org>; Kyrylo Tkachov
>> > > >> >> > <Kyrylo.Tkachov@arm.com>
>> > > >> >> > Subject: [ARM] PR66791: Replace builtins for vdup_n and vmov_n
>> > intrinsics
>> > > >> >> >
>> > > >> >> > Hi,
>> > > >> >> > This patch replaces builtins for vdup_n and vmov_n.
>> > > >> >> > The patch results in regression for pr51534.c.
>> > > >> >> > Consider following function:
>> > > >> >> >
>> > > >> >> > uint8x8_t f1 (uint8x8_t a) {
>> > > >> >> > return vcgt_u8(a, vdup_n_u8(0));
>> > > >> >> > }
>> > > >> >> >
>> > > >> >> > code-gen before patch:
>> > > >> >> > f1:
>> > > >> >> > vmov.i32 d16, #0 @ v8qi
>> > > >> >> > vcgt.u8 d0, d0, d16
>> > > >> >> > bx lr
>> > > >> >> >
>> > > >> >> > code-gen after patch:
>> > > >> >> > f1:
>> > > >> >> > vceq.i8 d0, d0, #0
>> > > >> >> > vmvn d0, d0
>> > > >> >> > bx lr
>> > > >> >> >
>> > > >> >> > I am not sure which one is better tho ?
>> > > >> >>
>> > > >> >
>> > > >> > Hi Prathamesh,
>> > > >> >
>> > > >> > This patch introduces a regression on non-hardfp configs (eg arm-
>> > linux-gnueabi or arm-eabi):
>> > > >> > FAIL: gcc:gcc.target/arm/arm.exp=gcc.target/arm/pr51534.c scan-
>> > assembler-times vmov.i32[ \t]+[dD][0-9]+, #0xffffffff 3
>> > > >> > FAIL: gcc:gcc.target/arm/arm.exp=gcc.target/arm/pr51534.c scan-
>> > assembler-times vmov.i32[ \t]+[qQ][0-9]+, #4294967295 3
>> > > >> >
>> > > >> > Can you fix this?
>> > > >> The issue is, for following test:
>> > > >>
>> > > >> #include <arm_neon.h>
>> > > >>
>> > > >> uint8x8_t f1 (uint8x8_t a) {
>> > > >> return vcge_u8(a, vdup_n_u8(0));
>> > > >> }
>> > > >>
>> > > >> armhf code-gen:
>> > > >> f1:
>> > > >> vmov.i32 d0, #0xffffffff @ v8qi
>> > > >> bx lr
>> > > >>
>> > > >> arm softfp code-gen:
>> > > >> f1:
>> > > >> mov r0, #-1
>> > > >> mov r1, #-1
>> > > >> bx lr
>> > > >>
>> > > >> The code-gen for both is same upto split2 pass:
>> > > >>
>> > > >> (insn 10 6 11 2 (set (reg/i:V8QI 16 s0)
>> > > >> (const_vector:V8QI [
>> > > >> (const_int -1 [0xffffffffffffffff]) repeated x8
>> > > >> ])) "foo.c":5:1 1052 {*neon_movv8qi}
>> > > >> (expr_list:REG_EQUAL (const_vector:V8QI [
>> > > >> (const_int -1 [0xffffffffffffffff]) repeated x8
>> > > >> ])
>> > > >> (nil)))
>> > > >> (insn 11 10 13 2 (use (reg/i:V8QI 16 s0)) "foo.c":5:1 -1
>> > > >> (nil))
>> > > >>
>> > > >> and for softfp target, split2 pass splits the assignment to r0 and r1:
>> > > >>
>> > > >> (insn 15 6 16 2 (set (reg:SI 0 r0)
>> > > >> (const_int -1 [0xffffffffffffffff])) "foo.c":5:1 740
>> > {*thumb2_movsi_vfp}
>> > > >> (nil))
>> > > >> (insn 16 15 11 2 (set (reg:SI 1 r1 [+4 ])
>> > > >> (const_int -1 [0xffffffffffffffff])) "foo.c":5:1 740
>> > {*thumb2_movsi_vfp}
>> > > >> (nil))
>> > > >> (insn 11 16 13 2 (use (reg/i:V8QI 0 r0)) "foo.c":5:1 -1
>> > > >> (nil))
>> > > >>
>> > > >> I suppose we could use a dg-scan for r[0-9]+, #-1 for softfp targets ?
>> > > >>
>> > > > Yes, probably, or try with check-function-bodies.
>> > > Hi,
>> > > Sorry for the late response. Does the attached patch look OK ?
>> > ping https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577532.html
>>
>> Ok.
>
>
>
> Sorry Prathamesh, this does not quite work. See https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/r12-3294-g7a6f40d0452ec76e126c2612dcfa32f3c73e2315/report-build-info.html
> (red cells in the gcc column)
>
> Can you have a look?
Ah, for arm-none-eabi target the code-gen for test_vcge_u8_uint8x8_t is:
mvn r0, #0
mvn r1, #0
bx lr
instead of:
mov r0, #-1
mov r1, #-1
bx lr
I guess both code-gen sequences are correct, but I am not sure under
which circumstance either sequence gets generated.
To accomodate for both, I tried to check for either pattern with OR:
/* { dg-final { scan-assembler-times "(mov\[ \]+r\[0-9\]+,
#-1)|(mvn\[ \]+r\[0-9\]+, #0)" 6 { target { arm_softfp_ok } } } }
*/
but that doesn't seem to work.
Do you have any suggestions on how to proceed ?
Thanks,
Prathamesh
>
> Thanks
>
> Christophe
>
>> Thanks,
>> Kyrill
>>
>> >
>> > Thanks,
>> > Prathamesh
>> > >
>> > > Thanks,
>> > > Prathamesh
>> > > >
>> > > > Christophe
>> > > >
>> > > >> Thanks,
>> > > >> Prathamesh
>> > > >> >
>> > > >> > Thanks
>> > > >> >
>> > > >> > Christophe
>> > > >> >
>> > > >> >
>> > > >> >>
>> > > >> >> I think they're equivalent in practice, in any case the patch itself is
>> > good (move away from RTL builtins).
>> > > >> >> Ok.
>> > > >> >> Thanks,
>> > > >> >> Kyrill
>> > > >> >>
>> > > >> >> >
>> > > >> >> > Also, this patch regressed bf16_dup.c on arm-linux-gnueabi,
>> > > >> >> > which is due to a missed opt in lowering. I had filed it as
>> > > >> >> > PR98435, and posted a fix for it here:
>> > > >> >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-
>> > June/572648.html
>> > > >> >> >
>> > > >> >> > Thanks,
>> > > >> >> > Prathamesh
next prev parent reply other threads:[~2021-09-03 8:36 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-24 11:11 Prathamesh Kulkarni
2021-06-24 16:28 ` Kyrylo Tkachov
2021-08-11 16:53 ` Christophe Lyon
2021-08-12 11:54 ` Prathamesh Kulkarni
2021-08-12 13:34 ` Christophe Lyon
2021-08-17 6:25 ` Prathamesh Kulkarni
2021-08-24 8:00 ` Prathamesh Kulkarni
2021-08-24 8:17 ` Kyrylo Tkachov
2021-09-02 9:02 ` Christophe Lyon
2021-09-03 8:35 ` Prathamesh Kulkarni [this message]
2021-09-03 10:58 ` Christophe LYON
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAAgBjMkBru_aMTwZv_Be89x6NV0JM0+tAYz-uD9pgwOXwF+g1w@mail.gmail.com \
--to=prathamesh.kulkarni@linaro.org \
--cc=Kyrylo.Tkachov@arm.com \
--cc=christophe.lyon.oss@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).