Hi Sudi,

Thanks for noticing that. I have attached an improved patch file that fixes
this issue.

Below is an updated description and changelog:

This patch adds an optimisation that exploits the AArch64 BFXIL instruction
when or-ing the result of two bitwise and operations with non-overlapping
bitmasks (e.g. (a & 0xFFFF0000) | (b & 0x0000FFFF)).

Example:

unsigned long long combine(unsigned long long a, unsigned long long b) {
  return (a & 0xffffffff00000000ll) | (b & 0x00000000ffffffffll);
}

void read(unsigned long long a, unsigned long long b, unsigned long long *c) {
  *c = combine(a, b);
}

When compiled with -O2, read would result in:

read:
  and   x5, x1, #0xffffffff
  and   x4, x0, #0xffffffff00000000
  orr   x4, x4, x5
  str   x4, [x2]
  ret

With this patch, it results in:

read:
  mov   x4, x0
  bfxil x4, x1, 0, 32
  str   x4, [x2]
  ret

Bootstrapped and regtested on aarch64-none-linux-gnu and aarch64-none-elf
with no regressions.

gcc/
2018-07-11  Sam Tebbs

        * config/aarch64/aarch64.md (*aarch64_bfxil, *aarch64_bfxil_alt):
        Define.
        * config/aarch64/aarch64-protos.h (aarch64_is_left_consecutive):
        Define.
        * config/aarch64/aarch64.c (aarch64_is_left_consecutive): New function.

gcc/testsuite
2018-07-11  Sam Tebbs

        * gcc.target/aarch64/combine_bfxil.c: New file.
        * gcc.target/aarch64/combine_bfxil_2.c: New file.

On 07/16/2018 11:54 AM, Sudakshina Das wrote:
> Hi Sam
>
> On 13/07/18 17:09, Sam Tebbs wrote:
>> Hi all,
>>
>> This patch adds an optimisation that exploits the AArch64 BFXIL
>> instruction when or-ing the result of two bitwise and operations with
>> non-overlapping bitmasks (e.g. (a & 0xFFFF0000) | (b & 0x0000FFFF)).
>>
>> Example:
>>
>> unsigned long long combine(unsigned long long a, unsigned long long b) {
>>    return (a & 0xffffffff00000000ll) | (b & 0x00000000ffffffffll);
>> }
>>
>> void read2(unsigned long long a, unsigned long long b, unsigned long
>> long *c,
>>    unsigned long long *d) {
>>    *c = combine(a, b); *d = combine(b, a);
>> }
>>
>> When compiled with -O2, read2 would result in:
>>
>> read2:
>>    and   x5, x1, #0xffffffff
>>    and   x4, x0, #0xffffffff00000000
>>    orr   x4, x4, x5
>>    and   x1, x1, #0xffffffff00000000
>>    and   x0, x0, #0xffffffff
>>    str   x4, [x2]
>>    orr   x0, x0, x1
>>    str   x0, [x3]
>>    ret
>>
>> But with this patch results in:
>>
>> read2:
>>    mov   x4, x1
>>    bfxil x4, x0, 0, 32
>>    str   x4, [x2]
>>    bfxil x0, x1, 0, 32
>>    str   x0, [x3]
>>    ret
>>
>> Bootstrapped and regtested on aarch64-none-linux-gnu and
>> aarch64-none-elf with no regressions.
>>
> I am not a maintainer but I have a question about this patch. I may be
> missing something or reading it wrong. So feel free to point it out:
>
> +(define_insn "*aarch64_bfxil"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +    (ior:DI (and:DI (match_operand:DI 1 "register_operand" "r")
> +            (match_operand 3 "const_int_operand"))
> +        (and:DI (match_operand:DI 2 "register_operand" "0")
> +            (match_operand 4 "const_int_operand"))))]
> +  "INTVAL (operands[3]) == ~INTVAL (operands[4])
> +    && aarch64_is_left_consecutive (INTVAL (operands[3]))"
> +  {
> +    HOST_WIDE_INT op4 = INTVAL (operands[4]);
> +    operands[3] = GEN_INT (64 - ceil_log2 (op4));
> +    output_asm_insn ("bfxil\\t%0, %1, 0, %3", operands);
>
> In the BFXIL you are reading %3 LSB bits from operand 1 and putting it
> in the LSBs of %0.
> This means that the pattern should be masking the 32-%3 MSB of %0 and
> %3 LSB of %1.
> So shouldn't operand 4 be LEFT_CONSECUTIVE?
>
> Can you please compare a simpler version of the above example you gave to
> make sure the generated assembly is equivalent before and after the
> patch:
>
> void read2(unsigned long long a, unsigned long long b, unsigned long
> long *c) {
>   *c = combine(a, b);
> }
>
>
> From the above text
>
> read2:
>   and   x5, x1, #0xffffffff
>   and   x4, x0, #0xffffffff00000000
>   orr   x4, x4, x5
>
> read2:
>   mov   x4, x1
>   bfxil x4, x0, 0, 32
>
> This does not seem equivalent to me.
>
> Thanks
> Sudi
>
> +    return "";
> +  }
> +  [(set_attr "type" "bfx")]
> +)
>> gcc/
>> 2018-07-11  Sam Tebbs
>>
>>          * config/aarch64/aarch64.md (*aarch64_bfxil, *aarch64_bfxil_alt):
>>          Define.
>>          * config/aarch64/aarch64-protos.h (aarch64_is_left_consecutive):
>>          Define.
>>          * config/aarch64/aarch64.c (aarch64_is_left_consecutive):
>> New function.
>>
>> gcc/testsuite
>> 2018-07-11  Sam Tebbs
>>
>>          * gcc.target/aarch64/combine_bfxil.c: New file.
>>          * gcc.target/aarch64/combine_bfxil_2.c: New file.
>>
>>
>
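
For anyone who wants to check the equivalence question from the review
without an AArch64 toolchain, the two instruction sequences can be modelled
in plain C on the host. This is only a sketch under stated assumptions:
bfxil_model, left_consecutive_sketch, low_field_width and self_check are
hypothetical names made up for this illustration. In particular,
left_consecutive_sketch is a guess at what "left consecutive" means (a run
of ones starting at the most significant bit) and is not the patch's
aarch64_is_left_consecutive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The C source from the example: high half of a, low half of b.  */
uint64_t combine(uint64_t a, uint64_t b)
{
    return (a & 0xffffffff00000000ULL) | (b & 0x00000000ffffffffULL);
}

/* Host-side model of "bfxil dst, src, lsb, width": take `width` bits of
   src starting at bit `lsb` and place them in the low `width` bits of
   dst, leaving the other bits of dst unchanged.  Assumes
   1 <= width and lsb + width <= 64.  */
uint64_t bfxil_model(uint64_t dst, uint64_t src, unsigned lsb, unsigned width)
{
    uint64_t low = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    return (dst & ~low) | ((src >> lsb) & low);
}

/* Hypothetical stand-in, NOT the function from the patch: true when the
   mask is a non-empty run of ones starting at bit 63, i.e. when ~mask
   has the form 2^k - 1.  */
bool left_consecutive_sketch(uint64_t mask)
{
    uint64_t inv = ~mask;
    return mask != 0 && (inv & (inv + 1)) == 0;
}

/* Number of low bits left clear by a high mask, which is the width the
   bit-field insert has to copy from the other operand.  */
unsigned low_field_width(uint64_t high_mask)
{
    unsigned width = 0;
    while (width < 64 && ((high_mask >> width) & 1) == 0)
        width++;
    return width;
}

/* Compare both instruction sequences against the C source.  */
void self_check(void)
{
    uint64_t a = 0x1111111122222222ULL, b = 0x3333333344444444ULL;

    /* Updated patch: mov x4, x0; bfxil x4, x1, 0, 32.  */
    assert(bfxil_model(a, b, 0, 32) == combine(a, b));

    /* First version: mov x4, x1; bfxil x4, x0, 0, 32.  */
    assert(bfxil_model(b, a, 0, 32) != combine(a, b));

    assert(left_consecutive_sketch(0xffffffff00000000ULL));
    assert(low_field_width(0xffffffff00000000ULL) == 32);
}
```

Calling self_check() from a small main() runs the comparison. In this model
the sequence from the updated description matches combine(a, b), while the
sequence quoted from the first version of the patch has its operands swapped
and does not.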