On 07/25/2018 07:08 PM, Sudakshina Das wrote: > Hi Sam > > On 25/07/18 14:08, Sam Tebbs wrote: >> On 07/23/2018 05:01 PM, Sudakshina Das wrote: >>> Hi Sam >>> >>> >>> On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote: >>>> Hi all, >>>> >>>> This patch extends the aarch64_get_lane_zero_extendsi instruction >>>> definition to >>>> also cover DI mode. This prevents a redundant AND instruction from >>>> being >>>> generated due to the pattern failing to be matched. >>>> >>>> Example: >>>> >>>> typedef char v16qi __attribute__ ((vector_size (16))); >>>> >>>> unsigned long long >>>> foo (v16qi a) >>>> { >>>>   return a[0]; >>>> } >>>> >>>> Previously generated: >>>> >>>> foo: >>>>         umov    w0, v0.b[0] >>>>         and     x0, x0, 255 >>>>         ret >>>> >>>> And now generates: >>>> >>>> foo: >>>>         umov    w0, v0.b[0] >>>>         ret >>>> >>>> Bootstrapped on aarch64-none-linux-gnu and tested on >>>> aarch64-none-elf with no >>>> regressions. >>>> >>>> gcc/ >>>> 2018-07-23  Sam Tebbs >>>> >>>>         * config/aarch64/aarch64-simd.md >>>>     (*aarch64_get_lane_zero_extendsi): >>>>         Rename to... >>>> (*aarch64_get_lane_zero_extend): ... This. >>>>         Use GPI iterator instead of SI mode. >>>> >>>> gcc/testsuite >>>> 2018-07-23  Sam Tebbs >>>> >>>>         * gcc.target/aarch64/extract_zero_extend.c: New file >>>> >>> You will need an approval from a maintainer, but I would only add >>> one request to this: >>> >>> diff --git a/gcc/config/aarch64/aarch64-simd.md >>> b/gcc/config/aarch64/aarch64-simd.md >>> index 89e38e6..15fb661 100644 >>> --- a/gcc/config/aarch64/aarch64-simd.md >>> +++ b/gcc/config/aarch64/aarch64-simd.md >>> @@ -3032,15 +3032,16 @@ >>>    [(set_attr "type" "neon_to_gp")] >>>  ) >>> >>> -(define_insn "*aarch64_get_lane_zero_extendsi" >>> -  [(set (match_operand:SI 0 "register_operand" "=r") >>> -    (zero_extend:SI >>> +(define_insn "*aarch64_get_lane_zero_extend" >>> +  [(set (match_operand:GPI 0 "register_operand" "=r") >>> +    (zero_extend:GPI >>> >>> Since you are adding 4 new patterns with this change, could you add >>> more cases in your test as well to make sure you have coverage for >>> each of them. >>> >>> Thanks >>> Sudi >> >> Hi Sudi, >> >> Thanks for the feedback. Here is an updated patch that adds more >> testcases to cover the patterns generated by the different mode >> combinations. The changelog and description from my original email >> still apply. >> > > Thanks for making the changes and adding more test cases. I do however > see that you are only covering 2 out of 4 new > *aarch64_get_lane_zero_extenddi<> patterns. The > *aarch64_get_lane_zero_extendsi<> were already existing. I don't mind > those tests. I would just ask you to add the other two new patterns > as well. Also since the different versions of the instruction generate > same instructions (like foo_16qi and foo_8qi both give out the same > instruction), I would suggest using a -fdump-rtl-final (or any relevant > rtl dump) with the dg-options and using a scan-rtl-dump to scan the > pattern name. Something like: > /* { dg-do compile } */ > /* { dg-options "-O3 -fdump-rtl-final" } */ > ... > ... > /* { dg-final { scan-rtl-dump "aarch64_get_lane_zero_extenddiv16qi" > "final" } } */ > > Thanks > Sudi Hi Sudi, Thanks again. Here's an update that adds 4 more tests, so all 8 patterns generated are now tested for! Below is the updated changelog gcc/ 2018-07-26  Sam Tebbs          * config/aarch64/aarch64-simd.md         (*aarch64_get_lane_zero_extendsi):         Rename to... (*aarch64_get_lane_zero_extend): ... This.         Use GPI iterator instead of SI mode. gcc/testsuite 2018-07-26  Sam Tebbs          * gcc.target/aarch64/extract_zero_extend.c: New file > >>> >>>        (vec_select: >>>          (match_operand:VDQQH 1 "register_operand" "w") >>>          (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))] >>>    "TARGET_SIMD" >>>    { >>> -    operands[2] = aarch64_endian_lane_rtx (mode, INTVAL >>> (operands[2])); >>> +    operands[2] = aarch64_endian_lane_rtx (mode, >>> +                       INTVAL (operands[2])); >>>      return "umov\\t%w0, %1.[%2]"; >>>    } >>>    [(set_attr "type" "neon_to_gp")] >> >