From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id 385713861802; Mon, 18 Jan 2021 14:32:20 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 385713861802
From: "clyon at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/98730] New: vceqzq_p64 does not generate vceq with immediate 0
Date: Mon, 18 Jan 2021 14:32:19 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: clyon at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Mon, 18 Jan 2021 14:32:20 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98730

            Bug ID: 98730
           Summary: vceqzq_p64 does not generate vceq with immediate 0
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

The vceqzq_p64 intrinsic was introduced with commit r11-6719
(g:63999d751df9bcde4ab9107edb4c635d274b248d) and is defined as:

vceqzq_p64 (poly64x2_t __a)
{
  poly64x2_t __b = vreinterpretq_p64_u32 (vdupq_n_u32 (0));
  return vceqq_p64 (__a, __b);
}

which is similar to what vceqz_p64 does:
vceqz_p64 (poly64x1_t __a)
{
  poly64x1_t __b = vreinterpret_p64_u32 (vdup_n_u32 (0));
  return vceq_p64 (__a, __b);
}

vceqzq_p64 uses vceqq_p64, which is defined as:

vceqq_p64 (poly64x2_t __a, poly64x2_t __b)
{
  poly64_t __high_a = vget_high_p64 (__a);
  poly64_t __high_b = vget_high_p64 (__b);
  uint64x1_t __high = vceq_p64 (__high_a, __high_b);

  poly64_t __low_a = vget_low_p64 (__a);
  poly64_t __low_b = vget_low_p64 (__b);
  uint64x1_t __low = vceq_p64 (__low_a, __low_b);
  return vcombine_u64 (__low, __high);
}

Unlike vceqz_p64, vceqzq_p64 does not use the vceq alternative with an
immediate, as shown by the vceqzq_p64.c testcase, which generates:

        ldr     r3, .L3
        vmov.i32        q10, #0  @ v4si
        vld1.64 {d16-d17}, [r3:64]
        vceq.i32        d18, d17, d21
        vceq.i32        d16, d16, d21
        vpmin.u32       d18, d18, d18
        vpmin.u32       d16, d16, d16
        vmov.f64        d17, d18  @ int
        vstr    d16, [r3, #16]
        vstr    d17, [r3, #24]
        bx      lr

By comparison, vceqz_p64 generates:

        ldr     r3, .L3
        vldr.64 d16, [r3]       @ int
        vceq.i32        d16, d16, #0
        vpmin.u32       d16, d16, d16
        vstr.64 d16, [r3, #8]   @ int
        bx      lr

The reload trace for vceqzq_p64 says:

         Choosing alt 0 in insn 19:  (0) =w  (1) w  (2) w {neon_vceqv2si_insn}
          alt=0,overall=0,losers=0,rld_nregs=0
         Choosing alt 0 in insn 15:  (0) =w  (1) w  (2) w {neon_vceqv2si_insn}
          alt=0,overall=0,losers=0,rld_nregs=0

(insn 19 8 15 2 (set (reg:V2SI 48 d16 [orig:128 _18 ] [128])
        (neg:V2SI (eq:V2SI (reg:V2SI 48 d16 [orig:139 v1 ] [139])
                (reg:V2SI 54 d19 [ _5+8 ]))))
"/home/christophe.lyon/src/GCC/builds/gcc-fsf-git-neon-intrinsics/tools/lib/gcc/arm-none-linux-gnueabihf/11.0.0/include/arm_neon.h":2404:22
1650 {neon_vceqv2si_insn}
     (expr_list:REG_EQUAL (neg:V2SI (eq:V2SI (subreg:V2SI (reg:DI 48 d16 [orig:139 v1 ] [139]) 0)
                (const_vector:V2SI [
                        (const_int 0 [0]) repeated x2
                    ])))
        (nil)))
(insn 15 19 20 2 (set (reg:V2SI 50 d17 [orig:121 _11 ] [121])
        (neg:V2SI (eq:V2SI (reg:V2SI 50 d17 [orig:141 v2 ] [141])
                (reg:V2SI 54 d19 [ _5+8 ]))))
"/home/christophe.lyon/src/GCC/builds/gcc-fsf-git-neon-intrinsics/tools/lib/gcc/arm-none-linux-gnueabihf/11.0.0/include/arm_neon.h":2404:22
1650 {neon_vceqv2si_insn}
     (expr_list:REG_EQUAL (neg:V2SI (eq:V2SI (subreg:V2SI (reg:DI 50 d17 [orig:141 v2 ] [141]) 0)
                (const_vector:V2SI [
                        (const_int 0 [0]) repeated x2
                    ])))
        (nil)))