From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/111720] RISC-V: Ugly codegen in RVV
Date: Thu, 19 Oct 2023 12:08:43 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: juzhe.zhong at rivai dot ai
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720

--- Comment #23 from JuzheZhong ---
(In reply to rguenther@suse.de from comment #22)
> On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> >
> > --- Comment #21 from JuzheZhong ---
> > (In reply to rguenther@suse.de from comment #20)
> > > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > >
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > >
> > > > --- Comment #19 from JuzheZhong ---
> > > > (In reply to Richard Biener from comment #18)
> > > > > With RVV you have intrinsic calls in GIMPLE so nothing to optimize:
> > > > >
> > > > > vbool8_t fn ()
> > > > > {
> > > > >   vbool8_t vmask;
> > > > >   vuint8m1_t vand_m;
> > > > >   vuint8m1_t varr;
> > > > >   uint8_t arr[32];
> > > > >
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   arr =
> > > > > "\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t";
> > > > >   varr_3 = __riscv_vle8_v_u8m1 (&arr, 32); [return slot optimization]
> > > > >   vand_m_4 = __riscv_vand_vx_u8m1 (varr_3, 1, 32); [return slot optimization]
> > > > >   vmask_5 = __riscv_vreinterpret_v_u8m1_b8 (vand_m_4); [return slot optimization]
> > > > >   <retval> = vmask_5;
> > > > >   arr ={v} {CLOBBER(eol)};
> > > > >   return <retval>;
> > > > >
> > > > > and on RTL I see lots of UNSPECs, RTL opts cannot do anything with those.
> > > > >
> > > > > This is what Andrew said already.
> > > >
> > > > Ok. I wonder why this issue is gone when I change it into:
> > > >
> > > > arr as static
> > > >
> > > > https://godbolt.org/z/Tdoshdfr6
> > >
> > > Because the stack initialization isn't required then.
> >
> > I have experimented with a simplified pattern:
> >
> >
> > (insn 14 13 15 2 (set (reg/v:RVVM1QI 134 [ varr ])
> >         (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> >                     (const_vector:RVVMF8BI repeat [
> >                             (const_int 1 [0x1])
> >                         ])
> >                     (reg:DI 143)
> >                     (const_int 2 [0x2]) repeated x2
> >                     (const_int 0 [0])
> >                     (reg:SI 66 vl)
> >                     (reg:SI 67 vtype)
> >                 ] UNSPEC_VPREDICATE)
> >             (mem:RVVM1QI (reg:DI 142) [0 S[16, 16] A8])
> >             (const_vector:RVVM1QI repeat [
> >                     (const_int 0 [0])
> >                 ]))) "rvv.c":5:23 1476 {*pred_movrvvm1qi}
> >      (nil))
> > (insn 15 14 16 2 (set (reg:DI 144)
> >         (const_int 32 [0x20])) "rvv.c":6:5 206 {*movdi_64bit}
> >      (nil))
> > (insn 16 15 0 2 (set (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0 S[16, 16] A8])
> >         (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> >                     (const_vector:RVVMF8BI repeat [
> >                             (const_int 1 [0x1])
> >                         ])
> >                     (reg:DI 144)
> >                     (const_int 0 [0])
> >                     (reg:SI 66 vl)
> >                     (reg:SI 67 vtype)
> >                 ] UNSPEC_VPREDICATE)
> >             (reg/v:RVVM1QI 134 [ varr ])
> >             (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0 S[16, 16] A8]))) "rvv.c":6:5 1592 {pred_storervvm1qi}
> >      (nil))
> >
> > You can see there is only one UNSPEC now. It still has the redundant stack
> > transfer.
> >
> > Is it because the pattern is too complicated?
>
> It's because it has an UNSPEC in it - that makes it have target
> specific (unknown to the middle-end) behavior so nothing can
> be optimized here.
>
> Specifically passes likely refuse to replace MEM operands in
> such a construct.

I saw that the ARM SVE load/store intrinsics also have UNSPECs.
They don't have such issues.

https://godbolt.org/z/fsW6Ko93z

But their patterns are much simpler than the RVV patterns.

I am still trying to find a way to optimize the RVV pattern for that.
However, it seems to be very difficult, since we are trying to merge each type of
intrinsic into the same single pattern to avoid an explosion of insn-output.cc
and insn-emit.cc