* [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: Sam Tebbs @ 2018-07-23 10:39 UTC
  To: gcc-patches, Marcus Shawcroft; +Cc: nd, Richard Earnshaw, James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 1220 bytes --]

Hi all,

This patch extends the aarch64_get_lane_zero_extendsi instruction definition to
also cover DI mode. This prevents a redundant AND instruction from being
generated due to the pattern failing to be matched.

Example:

typedef char v16qi __attribute__ ((vector_size (16)));

unsigned long long
foo (v16qi a)
{
  return a[0];
}

Previously generated:

foo:
        umov   w0, v0.b[0]
        and    x0, x0, 255
        ret

And now generates:

foo:
        umov   w0, v0.b[0]
        ret

Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf with no
regressions.

gcc/
2018-07-23 Sam Tebbs <sam.tebbs@arm.com>

        * config/aarch64/aarch64-simd.md
        (*aarch64_get_lane_zero_extendsi<mode>):
        Rename to...
        (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
        Use GPI iterator instead of SI mode.

gcc/testsuite
2018-07-23 Sam Tebbs <sam.tebbs@arm.com>

        * gcc.target/aarch64/extract_zero_extend.c: New file

[-- Attachment #2: 23-07-2018--11-27-30.patch --]
[-- Type: text/x-patch, Size: 1492 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 89e38e6..15fb661 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3032,15 +3032,16 @@
   [(set_attr "type" "neon_to_gp<q>")]
 )
 
-(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-        (zero_extend:SI
+(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (zero_extend:GPI
           (vec_select:<VEL>
             (match_operand:VDQQH 1 "register_operand" "w")
             (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_SIMD"
   {
-    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+                                           INTVAL (operands[2]));
     return "umov\\t%w0, %1.<Vetype>[%2]";
   }
   [(set_attr "type" "neon_to_gp<q>")]
diff --git a/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
new file mode 100644
index 0000000..40d307a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+typedef char v16qi __attribute__ ((vector_size (16)));
+
+unsigned long long
+foo (v16qi a)
+{
+  /* { dg-final { scan-assembler "umov\\t" } } */
+  /* { dg-final { scan-assembler-not "and\\t" } } */
+  return a[0];
+}
+
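The redundancy described above can be checked at the source level with a small sketch. It shows that extracting an unsigned byte lane and widening it to 64 bits already yields a value in the range 0..255, so masking with 255 changes nothing. On AArch64 this corresponds to UMOV zero-extending the element into w0, with the write to a W register clearing the upper half of x0. The type name `u8x16` and both function names are hypothetical, and `unsigned char` is used explicitly because the signedness of plain `char` is implementation-defined (it is unsigned in the AArch64 ABI, which is what the example in the patch relies on).

```c
/* Sketch only: GCC/Clang vector extension; names are illustrative.  */
typedef unsigned char u8x16 __attribute__ ((vector_size (16)));

/* Lane 0 widened to 64 bits: the conversion is already a zero-extension.  */
unsigned long long
extract_lane0 (u8x16 a)
{
  return a[0];
}

/* Same extraction with the explicit mask that the old code generation
   effectively added as "and x0, x0, 255".  */
unsigned long long
extract_lane0_masked (u8x16 a)
{
  return (unsigned long long) a[0] & 255;
}
```

Both functions return the same value for every input, which is why the compiler is free to drop the AND once the insn pattern matches the DI-mode zero-extension.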
* Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: Sudakshina Das @ 2018-07-23 16:01 UTC
  To: Sam Tebbs, gcc-patches, Marcus Shawcroft
  Cc: nd, Richard Earnshaw, James Greenhalgh

Hi Sam


On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
> Hi all,
>
> This patch extends the aarch64_get_lane_zero_extendsi instruction
> definition to
> also cover DI mode. This prevents a redundant AND instruction from being
> generated due to the pattern failing to be matched.
>
> Example:
>
> typedef char v16qi __attribute__ ((vector_size (16)));
>
> unsigned long long
> foo (v16qi a)
> {
>   return a[0];
> }
>
> Previously generated:
>
> foo:
>        umov   w0, v0.b[0]
>        and    x0, x0, 255
>        ret
>
> And now generates:
>
> foo:
>        umov   w0, v0.b[0]
>        ret
>
> Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf
> with no
> regressions.
>
> gcc/
> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>
>        * config/aarch64/aarch64-simd.md
>        (*aarch64_get_lane_zero_extendsi<mode>):
>        Rename to...
>        (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
>        Use GPI iterator instead of SI mode.
>
> gcc/testsuite
> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>
>        * gcc.target/aarch64/extract_zero_extend.c: New file
>
You will need an approval from a maintainer, but I would only add one
request to this:

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 89e38e6..15fb661 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3032,15 +3032,16 @@
   [(set_attr "type" "neon_to_gp<q>")]
 )

-(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-    (zero_extend:SI
+(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+    (zero_extend:GPI

Since you are adding 4 new patterns with this change, could you add more
cases in your test as well to make sure you have coverage for each of them.

Thanks
Sudi

      (vec_select:<VEL>
        (match_operand:VDQQH 1 "register_operand" "w")
        (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_SIMD"
   {
-    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+                  INTVAL (operands[2]));
     return "umov\\t%w0, %1.<Vetype>[%2]";
   }
   [(set_attr "type" "neon_to_gp<q>")]
* Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: Sam Tebbs @ 2018-07-25 13:09 UTC
  To: Sudakshina Das, gcc-patches
  Cc: Marcus Shawcroft, nd, Richard Earnshaw, James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 3180 bytes --]

On 07/23/2018 05:01 PM, Sudakshina Das wrote:
> Hi Sam
>
>
> On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
>> Hi all,
>>
>> This patch extends the aarch64_get_lane_zero_extendsi instruction
>> definition to
>> also cover DI mode. This prevents a redundant AND instruction from being
>> generated due to the pattern failing to be matched.
>>
>> Example:
>>
>> typedef char v16qi __attribute__ ((vector_size (16)));
>>
>> unsigned long long
>> foo (v16qi a)
>> {
>>   return a[0];
>> }
>>
>> Previously generated:
>>
>> foo:
>>        umov   w0, v0.b[0]
>>        and    x0, x0, 255
>>        ret
>>
>> And now generates:
>>
>> foo:
>>        umov   w0, v0.b[0]
>>        ret
>>
>> Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf
>> with no
>> regressions.
>>
>> gcc/
>> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>>
>>        * config/aarch64/aarch64-simd.md
>>        (*aarch64_get_lane_zero_extendsi<mode>):
>>        Rename to...
>>        (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
>>        Use GPI iterator instead of SI mode.
>>
>> gcc/testsuite
>> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>>
>>        * gcc.target/aarch64/extract_zero_extend.c: New file
>>
> You will need an approval from a maintainer, but I would only add one
> request to this:
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index 89e38e6..15fb661 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -3032,15 +3032,16 @@
>   [(set_attr "type" "neon_to_gp<q>")]
>  )
>
> -(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
> - [(set (match_operand:SI 0 "register_operand" "=r")
> -   (zero_extend:SI
> +(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
> + [(set (match_operand:GPI 0 "register_operand" "=r")
> +   (zero_extend:GPI
>
> Since you are adding 4 new patterns with this change, could you add
> more cases in your test as well to make sure you have coverage for
> each of them.
>
> Thanks
> Sudi

Hi Sudi,

Thanks for the feedback. Here is an updated patch that adds more testcases to
cover the patterns generated by the different mode combinations. The changelog
and description from my original email still apply.

>
>      (vec_select:<VEL>
>        (match_operand:VDQQH 1 "register_operand" "w")
>        (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
>   "TARGET_SIMD"
>   {
> -   operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL
> (operands[2]));
> +   operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
> +                 INTVAL (operands[2]));
>     return "umov\\t%w0, %1.<Vetype>[%2]";
>   }
>   [(set_attr "type" "neon_to_gp<q>")]

[-- Attachment #2: latest.patch --]
[-- Type: text/x-patch, Size: 2222 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index f1784d72e55c412d076de43f2f7aad4632d55ecb..e92a3b49c65e84d2a16a2a480c359a0b4d8fa3e3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3033,15 +3033,16 @@
   [(set_attr "type" "neon_to_gp<q>")]
 )
 
-(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-        (zero_extend:SI
+(define_insn "*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (zero_extend:GPI
           (vec_select:<VEL>
             (match_operand:VDQQH 1 "register_operand" "w")
             (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_SIMD"
   {
-    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+                                           INTVAL (operands[2]));
     return "umov\\t%w0, %1.<Vetype>[%2]";
   }
   [(set_attr "type" "neon_to_gp<q>")]
diff --git a/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
new file mode 100644
index 0000000000000000000000000000000000000000..deb613cd23150a83dfd36ae84504415993b97be3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Tests *aarch64_get_lane_zero_extenddiv16qi.  */
+typedef unsigned char v16qi __attribute__ ((vector_size (16)));
+/* Tests *aarch64_get_lane_zero_extenddiv8qi.  */
+typedef unsigned char v8qi __attribute__ ((vector_size (8)));
+
+/* Tests *aarch64_get_lane_zero_extendsiv8hi.  */
+typedef unsigned short v16hi __attribute__ ((vector_size (16)));
+/* Tests *aarch64_get_lane_zero_extendsiv4hi.  */
+typedef unsigned short v8hi __attribute__ ((vector_size (8)));
+
+unsigned long long
+foo_16qi (v16qi a)
+{
+  return a[0];
+}
+
+unsigned long long
+foo_8qi (v8qi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_16hi (v16hi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_8hi (v8hi a)
+{
+  return a[0];
+}
+
+/* { dg-final { scan-assembler-times "umov\\t" 4 } } */
+/* { dg-final { scan-assembler-not "and\\t" } } */
* Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: Sudakshina Das @ 2018-07-25 18:08 UTC
  To: Sam Tebbs, gcc-patches
  Cc: Marcus Shawcroft, nd, Richard Earnshaw, James Greenhalgh

Hi Sam

On 25/07/18 14:08, Sam Tebbs wrote:
> On 07/23/2018 05:01 PM, Sudakshina Das wrote:
>> Hi Sam
>>
>>
>> On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
>>> Hi all,
>>>
>>> This patch extends the aarch64_get_lane_zero_extendsi instruction
>>> definition to
>>> also cover DI mode. This prevents a redundant AND instruction from being
>>> generated due to the pattern failing to be matched.
>>>
>>> Example:
>>>
>>> typedef char v16qi __attribute__ ((vector_size (16)));
>>>
>>> unsigned long long
>>> foo (v16qi a)
>>> {
>>>   return a[0];
>>> }
>>>
>>> Previously generated:
>>>
>>> foo:
>>>        umov   w0, v0.b[0]
>>>        and    x0, x0, 255
>>>        ret
>>>
>>> And now generates:
>>>
>>> foo:
>>>        umov   w0, v0.b[0]
>>>        ret
>>>
>>> Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf
>>> with no
>>> regressions.
>>>
>>> gcc/
>>> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>>>
>>>        * config/aarch64/aarch64-simd.md
>>>        (*aarch64_get_lane_zero_extendsi<mode>):
>>>        Rename to...
>>>        (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
>>>        Use GPI iterator instead of SI mode.
>>>
>>> gcc/testsuite
>>> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>>>
>>>        * gcc.target/aarch64/extract_zero_extend.c: New file
>>>
>> You will need an approval from a maintainer, but I would only add one
>> request to this:
>>
>> diff --git a/gcc/config/aarch64/aarch64-simd.md
>> b/gcc/config/aarch64/aarch64-simd.md
>> index 89e38e6..15fb661 100644
>> --- a/gcc/config/aarch64/aarch64-simd.md
>> +++ b/gcc/config/aarch64/aarch64-simd.md
>> @@ -3032,15 +3032,16 @@
>>   [(set_attr "type" "neon_to_gp<q>")]
>>  )
>>
>> -(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
>> - [(set (match_operand:SI 0 "register_operand" "=r")
>> -   (zero_extend:SI
>> +(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
>> + [(set (match_operand:GPI 0 "register_operand" "=r")
>> +   (zero_extend:GPI
>>
>> Since you are adding 4 new patterns with this change, could you add
>> more cases in your test as well to make sure you have coverage for
>> each of them.
>>
>> Thanks
>> Sudi
>
> Hi Sudi,
>
> Thanks for the feedback. Here is an updated patch that adds more
> testcases to cover the patterns generated by the different mode
> combinations. The changelog and description from my original email still
> apply.
>

Thanks for making the changes and adding more test cases. I do however
see that you are only covering 2 out of 4 new
*aarch64_get_lane_zero_extenddi<> patterns. The
*aarch64_get_lane_zero_extendsi<> were already existing. I don't mind
those tests. I would just ask you to add the other two new patterns
as well. Also since the different versions of the instruction generate
same instructions (like foo_16qi and foo_8qi both give out the same
instruction), I would suggest using a -fdump-rtl-final (or any relevant
rtl dump) with the dg-options and using a scan-rtl-dump to scan the
pattern name. Something like:
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-rtl-final" } */
...
...
/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" "final" } } */

Thanks
Sudi

>>
>>      (vec_select:<VEL>
>>        (match_operand:VDQQH 1 "register_operand" "w")
>>        (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
>>   "TARGET_SIMD"
>>   {
>> -   operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL
>> (operands[2]));
>> +   operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
>> +                 INTVAL (operands[2]));
>>     return "umov\\t%w0, %1.<Vetype>[%2]";
>>   }
>>   [(set_attr "type" "neon_to_gp<q>")]
>
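The suggestion above could look roughly like the following test fragment. It is a sketch, not the file that was committed: it covers only the two QI-element cases that Sudi mentions, and it assumes the "final" RTL dump prints the matched insn name. Both functions are expected to assemble to the same `umov w0, v0.b[0]`, so a scan of the assembly alone cannot tell which pattern matched, while a scan of the RTL dump for the pattern name can.

```c
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-rtl-final" } */

/* 16 x 8-bit lanes (V16QI) and 8 x 8-bit lanes (V8QI).  */
typedef unsigned char v16qi __attribute__ ((vector_size (16)));
typedef unsigned char v8qi __attribute__ ((vector_size (8)));

/* Lane 0 zero-extended to 64 bits from a 128-bit vector.  */
unsigned long long
foo_16qi (v16qi a)
{
  return a[0];
}

/* Lane 0 zero-extended to 64 bits from a 64-bit vector.  */
unsigned long long
foo_8qi (v8qi a)
{
  return a[0];
}

/* One directive per pattern name, so each mode combination is checked.  */
/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" "final" } } */
/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv8qi" "final" } } */
```

The dg directives are ordinary C comments, so the file also compiles outside the GCC testsuite; only the DejaGnu harness interprets them.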
* Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: Sam Tebbs @ 2018-07-26 16:52 UTC
  To: Sudakshina Das, gcc-patches
  Cc: Marcus Shawcroft, nd, Richard Earnshaw, James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 4978 bytes --]

On 07/25/2018 07:08 PM, Sudakshina Das wrote:
> Hi Sam
>
> On 25/07/18 14:08, Sam Tebbs wrote:
>> On 07/23/2018 05:01 PM, Sudakshina Das wrote:
>>> Hi Sam
>>>
>>>
>>> On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
>>>> Hi all,
>>>>
>>>> This patch extends the aarch64_get_lane_zero_extendsi instruction
>>>> definition to
>>>> also cover DI mode. This prevents a redundant AND instruction from
>>>> being
>>>> generated due to the pattern failing to be matched.
>>>>
>>>> Example:
>>>>
>>>> typedef char v16qi __attribute__ ((vector_size (16)));
>>>>
>>>> unsigned long long
>>>> foo (v16qi a)
>>>> {
>>>>   return a[0];
>>>> }
>>>>
>>>> Previously generated:
>>>>
>>>> foo:
>>>>        umov   w0, v0.b[0]
>>>>        and    x0, x0, 255
>>>>        ret
>>>>
>>>> And now generates:
>>>>
>>>> foo:
>>>>        umov   w0, v0.b[0]
>>>>        ret
>>>>
>>>> Bootstrapped on aarch64-none-linux-gnu and tested on
>>>> aarch64-none-elf with no
>>>> regressions.
>>>>
>>>> gcc/
>>>> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>>>>
>>>>        * config/aarch64/aarch64-simd.md
>>>>        (*aarch64_get_lane_zero_extendsi<mode>):
>>>>        Rename to...
>>>>        (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
>>>>        Use GPI iterator instead of SI mode.
>>>>
>>>> gcc/testsuite
>>>> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>>>>
>>>>        * gcc.target/aarch64/extract_zero_extend.c: New file
>>>>
>>> You will need an approval from a maintainer, but I would only add
>>> one request to this:
>>>
>>> diff --git a/gcc/config/aarch64/aarch64-simd.md
>>> b/gcc/config/aarch64/aarch64-simd.md
>>> index 89e38e6..15fb661 100644
>>> --- a/gcc/config/aarch64/aarch64-simd.md
>>> +++ b/gcc/config/aarch64/aarch64-simd.md
>>> @@ -3032,15 +3032,16 @@
>>>   [(set_attr "type" "neon_to_gp<q>")]
>>>  )
>>>
>>> -(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
>>> - [(set (match_operand:SI 0 "register_operand" "=r")
>>> -   (zero_extend:SI
>>> +(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
>>> + [(set (match_operand:GPI 0 "register_operand" "=r")
>>> +   (zero_extend:GPI
>>>
>>> Since you are adding 4 new patterns with this change, could you add
>>> more cases in your test as well to make sure you have coverage for
>>> each of them.
>>>
>>> Thanks
>>> Sudi
>>
>> Hi Sudi,
>>
>> Thanks for the feedback. Here is an updated patch that adds more
>> testcases to cover the patterns generated by the different mode
>> combinations. The changelog and description from my original email
>> still apply.
>>
>
> Thanks for making the changes and adding more test cases. I do however
> see that you are only covering 2 out of 4 new
> *aarch64_get_lane_zero_extenddi<> patterns. The
> *aarch64_get_lane_zero_extendsi<> were already existing. I don't mind
> those tests. I would just ask you to add the other two new patterns
> as well. Also since the different versions of the instruction generate
> same instructions (like foo_16qi and foo_8qi both give out the same
> instruction), I would suggest using a -fdump-rtl-final (or any relevant
> rtl dump) with the dg-options and using a scan-rtl-dump to scan the
> pattern name. Something like:
> /* { dg-do compile } */
> /* { dg-options "-O3 -fdump-rtl-final" } */
> ...
> ...
> /* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi"
> "final" } } */
>
> Thanks
> Sudi

Hi Sudi,

Thanks again. Here's an update that adds 4 more tests, so all 8 patterns
generated are now tested for!

Below is the updated changelog

gcc/
2018-07-26 Sam Tebbs <sam.tebbs@arm.com>

        * config/aarch64/aarch64-simd.md
        (*aarch64_get_lane_zero_extendsi<mode>):
        Rename to...
        (*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>): ... This.
        Use GPI iterator instead of SI mode.

gcc/testsuite
2018-07-26 Sam Tebbs <sam.tebbs@arm.com>

        * gcc.target/aarch64/extract_zero_extend.c: New file

>
>>>
>>>      (vec_select:<VEL>
>>>        (match_operand:VDQQH 1 "register_operand" "w")
>>>        (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
>>>   "TARGET_SIMD"
>>>   {
>>> -   operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL
>>> (operands[2]));
>>> +   operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
>>> +                 INTVAL (operands[2]));
>>>     return "umov\\t%w0, %1.<Vetype>[%2]";
>>>   }
>>>   [(set_attr "type" "neon_to_gp<q>")]
>>
>

[-- Attachment #2: latest.patch --]
[-- Type: text/x-patch, Size: 3430 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index f1784d72e55c412d076de43f2f7aad4632d55ecb..e92a3b49c65e84d2a16a2a480c359a0b4d8fa3e3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3033,15 +3033,16 @@
   [(set_attr "type" "neon_to_gp<q>")]
 )
 
-(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-        (zero_extend:SI
+(define_insn "*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (zero_extend:GPI
           (vec_select:<VEL>
             (match_operand:VDQQH 1 "register_operand" "w")
             (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_SIMD"
   {
-    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+                                           INTVAL (operands[2]));
     return "umov\\t%w0, %1.<Vetype>[%2]";
   }
   [(set_attr "type" "neon_to_gp<q>")]
diff --git a/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
new file mode 100644
index 0000000000000000000000000000000000000000..a294b261909a1d67ab339c929f2609dcda01c067
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
@@ -0,0 +1,81 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-rtl-final" } */
+
+/* Tests div16qi.  */
+typedef unsigned char div16qi __attribute__ ((vector_size (16)));
+/* Tests div8qi.  */
+typedef unsigned char div8qi __attribute__ ((vector_size (8)));
+/* Tests div8hi.  */
+typedef unsigned short div8hi __attribute__ ((vector_size (16)));
+/* Tests div4hi.  */
+typedef unsigned short div4hi __attribute__ ((vector_size (8)));
+
+/* Tests siv16qi.  */
+typedef unsigned char siv16qi __attribute__ ((vector_size (16)));
+/* Tests siv8qi.  */
+typedef unsigned char siv8qi __attribute__ ((vector_size (8)));
+/* Tests siv8hi.  */
+typedef unsigned short siv8hi __attribute__ ((vector_size (16)));
+/* Tests siv4hi.  */
+typedef unsigned short siv4hi __attribute__ ((vector_size (8)));
+
+
+unsigned long long
+foo_div16qi (div16qi a)
+{
+  return a[0];
+}
+
+unsigned long long
+foo_div8qi (div8qi a)
+{
+  return a[0];
+}
+
+unsigned long long
+foo_div8hi (div8hi a)
+{
+  return a[0];
+}
+
+unsigned long long
+foo_div4hi (div4hi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_siv16qi (siv16qi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_siv8qi (siv8qi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_siv8hi (siv8hi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_siv4hi (siv4hi a)
+{
+  return a[0];
+}
+
+/* { dg-final { scan-assembler-times "umov\\t" 8 } } */
+/* { dg-final { scan-assembler-not "and\\t" } } */
+
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv8qi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv8hi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv4hi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extendsiv16qi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extendsiv8qi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extendsiv8hi" "final" } } */
+/* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extendsiv4hi" "final" } } */
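The count of 8 patterns follows from the two mode iterators in the renamed pattern: each GPI mode is paired with each VDQQH mode. The sketch below enumerates the resulting names, assuming GPI expands to SI and DI and VDQQH expands to V8QI, V16QI, V4HI and V8HI, which matches the eight names scanned for in the test above. The helpers `pattern_count` and `pattern_name` are hypothetical and exist only for this illustration; they are not part of GCC.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed iterator contents, lower-cased as they appear in insn names.  */
static const char *const gpi_modes[] = { "si", "di" };
static const char *const vdqqh_modes[] = { "v8qi", "v16qi", "v4hi", "v8hi" };

enum { N_GPI = 2, N_VDQQH = 4 };

/* Number of insn patterns produced by the <GPI:mode><VDQQH:mode> product.  */
size_t
pattern_count (void)
{
  return (size_t) N_GPI * N_VDQQH;
}

/* Write the name for one (GPI, VDQQH) pair into BUF.
   Returns the length written, or -1 if an index is out of range.  */
int
pattern_name (char *buf, size_t len, size_t gpi, size_t vec)
{
  if (gpi >= N_GPI || vec >= N_VDQQH)
    return -1;
  return snprintf (buf, len, "aarch64_get_lane_zero_extend%s%s",
                   gpi_modes[gpi], vdqqh_modes[vec]);
}
```

Under these assumptions the four names with `si` are the pre-existing patterns and the four with `di` are the ones the patch adds, which is the distinction Sudi draws in her review.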
* Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: James Greenhalgh @ 2018-07-31 22:16 UTC
  To: Sam Tebbs
  Cc: Sudakshina Das, gcc-patches, Marcus Shawcroft, nd, Richard Earnshaw

On Thu, Jul 26, 2018 at 11:52:15AM -0500, Sam Tebbs wrote:

<snip>

> > Thanks for making the changes and adding more test cases. I do however
> > see that you are only covering 2 out of 4 new
> > *aarch64_get_lane_zero_extenddi<> patterns. The
> > *aarch64_get_lane_zero_extendsi<> were already existing. I don't mind
> > those tests. I would just ask you to add the other two new patterns
> > as well. Also since the different versions of the instruction generate
> > same instructions (like foo_16qi and foo_8qi both give out the same
> > instruction), I would suggest using a -fdump-rtl-final (or any relevant
> > rtl dump) with the dg-options and using a scan-rtl-dump to scan the
> > pattern name. Something like:
> > /* { dg-do compile } */
> > /* { dg-options "-O3 -fdump-rtl-final" } */
> > ...
> > ...
> > /* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi"
> > "final" } } */
> >
> > Thanks
> > Sudi
>
> Hi Sudi,
>
> Thanks again. Here's an update that adds 4 more tests, so all 8 patterns
> generated are now tested for!

This is OK for trunk, thanks for the patch (and thanks Sudi for the review!)

Thanks,
James

>
> Below is the updated changelog
>
> gcc/
> 2018-07-26 Sam Tebbs <sam.tebbs@arm.com>
>
>        * config/aarch64/aarch64-simd.md
>        (*aarch64_get_lane_zero_extendsi<mode>):
>        Rename to...
>        (*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>): ... This.
>        Use GPI iterator instead of SI mode.
>
> gcc/testsuite
> 2018-07-26 Sam Tebbs <sam.tebbs@arm.com>
>
>        * gcc.target/aarch64/extract_zero_extend.c: New file
>
* Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: Sam Tebbs @ 2018-08-01 9:13 UTC
  To: gcc-patches
  Cc: James Greenhalgh, Sudakshina Das, Marcus Shawcroft, nd, Richard Earnshaw

On 07/31/2018 11:16 PM, James Greenhalgh wrote:
> On Thu, Jul 26, 2018 at 11:52:15AM -0500, Sam Tebbs wrote:
>
> <snip>
>
>>> Thanks for making the changes and adding more test cases. I do however
>>> see that you are only covering 2 out of 4 new
>>> *aarch64_get_lane_zero_extenddi<> patterns. The
>>> *aarch64_get_lane_zero_extendsi<> were already existing. I don't mind
>>> those tests. I would just ask you to add the other two new patterns
>>> as well. Also since the different versions of the instruction generate
>>> same instructions (like foo_16qi and foo_8qi both give out the same
>>> instruction), I would suggest using a -fdump-rtl-final (or any relevant
>>> rtl dump) with the dg-options and using a scan-rtl-dump to scan the
>>> pattern name. Something like:
>>> /* { dg-do compile } */
>>> /* { dg-options "-O3 -fdump-rtl-final" } */
>>> ...
>>> ...
>>> /* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi"
>>> "final" } } */
>>>
>>> Thanks
>>> Sudi
>> Hi Sudi,
>>
>> Thanks again. Here's an update that adds 4 more tests, so all 8 patterns
>> generated are now tested for!
> This is OK for trunk, thanks for the patch (and thanks Sudi for the review!)
>
> Thanks,
> James

Thank you James! I'd appreciate it if someone could commit it as I don't have
commit rights yet.

Sam

>
>> Below is the updated changelog
>>
>> gcc/
>> 2018-07-26 Sam Tebbs <sam.tebbs@arm.com>
>>
>>        * config/aarch64/aarch64-simd.md
>>        (*aarch64_get_lane_zero_extendsi<mode>):
>>        Rename to...
>>        (*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>): ... This.
>>        Use GPI iterator instead of SI mode.
>>
>> gcc/testsuite
>> 2018-07-26 Sam Tebbs <sam.tebbs@arm.com>
>>
>>        * gcc.target/aarch64/extract_zero_extend.c: New file
>>
* Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: Sudakshina Das @ 2018-08-01 10:20 UTC
  To: Sam Tebbs, gcc-patches
  Cc: James Greenhalgh, Marcus Shawcroft, nd, Richard Earnshaw

Hi Sam

On 01/08/18 10:12, Sam Tebbs wrote:
>
>
> On 07/31/2018 11:16 PM, James Greenhalgh wrote:
>> On Thu, Jul 26, 2018 at 11:52:15AM -0500, Sam Tebbs wrote:
>>
>> <snip>
>>
>>>> Thanks for making the changes and adding more test cases. I do however
>>>> see that you are only covering 2 out of 4 new
>>>> *aarch64_get_lane_zero_extenddi<> patterns. The
>>>> *aarch64_get_lane_zero_extendsi<> were already existing. I don't mind
>>>> those tests. I would just ask you to add the other two new patterns
>>>> as well. Also since the different versions of the instruction generate
>>>> same instructions (like foo_16qi and foo_8qi both give out the same
>>>> instruction), I would suggest using a -fdump-rtl-final (or any relevant
>>>> rtl dump) with the dg-options and using a scan-rtl-dump to scan the
>>>> pattern name. Something like:
>>>> /* { dg-do compile } */
>>>> /* { dg-options "-O3 -fdump-rtl-final" } */
>>>> ...
>>>> ...
>>>> /* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi"
>>>> "final" } } */
>>>>
>>>> Thanks
>>>> Sudi
>>> Hi Sudi,
>>>
>>> Thanks again. Here's an update that adds 4 more tests, so all 8 patterns
>>> generated are now tested for!
>> This is OK for trunk, thanks for the patch (and thanks Sudi for the
>> review!)
>>
>> Thanks,
>> James
>
> Thank you James! I'd appreciate it if someone could commit it as I don't
> have commit rights yet.
>

I have committed this on your behalf as r263200.

Thanks
Sudi

> Sam
>
>>
>>> Below is the updated changelog
>>>
>>> gcc/
>>> 2018-07-26 Sam Tebbs <sam.tebbs@arm.com>
>>>
>>>         * config/aarch64/aarch64-simd.md
>>>         (*aarch64_get_lane_zero_extendsi<mode>):
>>>         Rename to...
>>>         (*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>): ... This.
>>>         Use GPI iterator instead of SI mode.
>>>
>>> gcc/testsuite
>>> 2018-07-26 Sam Tebbs <sam.tebbs@arm.com>
>>>
>>>         * gcc.target/aarch64/extract_zero_extend.c: New file
>>>
>
* Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode
From: Sudakshina Das @ 2018-07-27 12:39 UTC
  To: Sam Tebbs, gcc-patches
  Cc: Marcus Shawcroft, nd, Richard Earnshaw, James Greenhalgh

Hi Sam

On 25/07/18 14:08, Sam Tebbs wrote:
> On 07/23/2018 05:01 PM, Sudakshina Das wrote:
>> Hi Sam
>>
>>
>> On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:
>>> Hi all,
>>>
>>> This patch extends the aarch64_get_lane_zero_extendsi instruction
>>> definition to
>>> also cover DI mode. This prevents a redundant AND instruction from being
>>> generated due to the pattern failing to be matched.
>>>
>>> Example:
>>>
>>> typedef char v16qi __attribute__ ((vector_size (16)));
>>>
>>> unsigned long long
>>> foo (v16qi a)
>>> {
>>>   return a[0];
>>> }
>>>
>>> Previously generated:
>>>
>>> foo:
>>>        umov   w0, v0.b[0]
>>>        and    x0, x0, 255
>>>        ret
>>>
>>> And now generates:
>>>
>>> foo:
>>>        umov   w0, v0.b[0]
>>>        ret
>>>
>>> Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf
>>> with no
>>> regressions.
>>>
>>> gcc/
>>> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>>>
>>>        * config/aarch64/aarch64-simd.md
>>>        (*aarch64_get_lane_zero_extendsi<mode>):
>>>        Rename to...
>>>        (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
>>>        Use GPI iterator instead of SI mode.
>>>
>>> gcc/testsuite
>>> 2018-07-23 Sam Tebbs <sam.tebbs@arm.com>
>>>
>>>        * gcc.target/aarch64/extract_zero_extend.c: New file
>>>
>> You will need an approval from a maintainer, but I would only add one
>> request to this:
>>
>> diff --git a/gcc/config/aarch64/aarch64-simd.md
>> b/gcc/config/aarch64/aarch64-simd.md
>> index 89e38e6..15fb661 100644
>> --- a/gcc/config/aarch64/aarch64-simd.md
>> +++ b/gcc/config/aarch64/aarch64-simd.md
>> @@ -3032,15 +3032,16 @@
>>   [(set_attr "type" "neon_to_gp<q>")]
>>  )
>>
>> -(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
>> - [(set (match_operand:SI 0 "register_operand" "=r")
>> -   (zero_extend:SI
>> +(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
>> + [(set (match_operand:GPI 0 "register_operand" "=r")
>> +   (zero_extend:GPI
>>
>> Since you are adding 4 new patterns with this change, could you add
>> more cases in your test as well to make sure you have coverage for
>> each of them.
>>
>> Thanks
>> Sudi
>
> Hi Sudi,
>
> Thanks for the feedback. Here is an updated patch that adds more
> testcases to cover the patterns generated by the different mode
> combinations. The changelog and description from my original email still
> apply.
>

Thanks it looks good to me! You will still need a maintainer to approve.

Sudi

>>
>>      (vec_select:<VEL>
>>        (match_operand:VDQQH 1 "register_operand" "w")
>>        (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
>>   "TARGET_SIMD"
>>   {
>> -   operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL
>> (operands[2]));
>> +   operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
>> +                 INTVAL (operands[2]));
>>     return "umov\\t%w0, %1.<Vetype>[%2]";
>>   }
>>   [(set_attr "type" "neon_to_gp<q>")]
>