Hi,

For the following test-case:

    #include <arm_neon.h>

    uint8x8_t f1(int8x8_t a, int8x8_t b)
    {
      return (uint8x8_t) ((a & b) != 0);
    }

gcc fails to lower the test operation to vtst, and instead emits:

    f1:
            vand    d0, d0, d1
            vceq.i8 d0, d0, #0
            vmvn    d0, d0
            bx      lr

The attached patch tries to fix this by adding a pattern to match this
combine attempt:

    Trying 7, 8 -> 9:
        7: r120:V8QI=r123:V8QI&r124:V8QI
          REG_DEAD r124:V8QI
          REG_DEAD r123:V8QI
        8: r122:V8QI=-r120:V8QI==const_vector
          REG_DEAD r120:V8QI
        9: r121:V8QI=~r122:V8QI
          REG_DEAD r122:V8QI
    Failed to match this instruction:
    (set (reg:V8QI 121)
        (plus:V8QI (eq:V8QI (and:V8QI (reg:V8QI 123)
                    (reg:V8QI 124))
                (const_vector:V8QI [
                        (const_int 0 [0]) repeated x8
                    ]))
            (const_vector:V8QI [
                    (const_int -1 [0xffffffffffffffff]) repeated x8
                ])))

Essentially it converts:

    r120 = (and r123 r124)
    r122 = (neg (eq r120 0))
    r121 = (not r122)
    -->
    r121 = vtst r123, r124

(I guess it simplifies (not (neg X)) to (plus X -1) above.)

Code-gen after patch:

    f1:
            vtst.8  d0, d0, d1
            bx      lr

Bootstrapped + tested on arm-linux-gnueabihf, and cross tested on arm*-*-*.
Does it look OK for next stage-1?

Thanks,
Prathamesh