From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/111720] RISC-V: Ugly codegen in RVV
Date: Tue, 17 Oct 2023 08:26:49 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720

--- Comment #13 from JuzheZhong ---
I can confirm that ARM SVE has the same issue:

https://godbolt.org/z/TjcaM6xsP

#include <arm_sve.h>

void fn(uint8_t * __restrict out)
{
  uint8_t arr[32] = {1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
                     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9};
  uint8_t m = 1;
  svint8_t varr = *(svint8_t*)arr;
  *(svint8_t*)out = varr;
}

ARM GCC:

fn:
        adrp    x1, .LANCHOR0
        add     x1, x1, :lo12:.LANCHOR0
        sub     sp, sp, #32
        ptrue   p7.b, all
        ldp     q31, q30, [x1]      -----> redundant stack spilling.
        stp     q31, q30, [sp]      -----> redundant stack spilling.
        ld1b    z31.b, p7/z, [sp]
        st1b    z31.b, p7, [x0]
        add     sp, sp, 32
        ret

ARM clang:

fn:                                     // @fn
        ptrue   p0.b
        adrp    x8, .L__const.fn.arr
        add     x8, x8, :lo12:.L__const.fn.arr
        ld1b    { z0.b }, p0/z, [x8]
        st1b    { z0.b }, p0, [x0]
        ret

GCC copies the constant array to the stack and then reloads it from there
with ld1b, whereas clang loads it directly from the constant pool.

Hi, Richard. Could you comment on this issue? Thanks.
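
For reference, here is a portable sketch of what fn is expected to do, which may help when judging the two code sequences above. It assumes a 256-bit vector length, which the 32-byte ldp/stp pair in the GCC output suggests, so that one SVE vector covers the whole array. The names fn_reference and arr_const are hypothetical and do not appear in the bug report.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical constant table holding the same 32 bytes as `arr` in the
   reproducer; in the compiled code this lives in read-only data. */
static const uint8_t arr_const[32] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9
};

/* Hypothetical portable equivalent of fn(): the only observable effect
   is that 32 constant bytes are written to `out`.  Assumes a 256-bit
   vector length, so no stack copy of the array is needed. */
void fn_reference(uint8_t *restrict out)
{
    memcpy(out, arr_const, sizeof arr_const);
}
```

Under that assumption, the ideal code is a single vector load from the constant followed by a single vector store to out, which is the shape of the clang output.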