From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
Date: Tue, 12 Sep 2023 11:44:57 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

--- Comment #23 from JuzheZhong ---
Hi, Richard and Richi.

I found a way to simulate an "undefined" value in the COND_LEN_xxx patterns
for the case where the ELSE value doesn't matter.

First, return a size_type 0 in the else_value target hook:

/* Use size_type 0, which is represented as const0_rtx in RTL, to simulate
   an undefined else value, since GCC doesn't have an undefined value in
   its TREE/GIMPLE representation.
   TODO: We may need to support an undefined value in the TREE/GIMPLE
   middle-end IR.
   But the current approach is good enough for RVV codegen/performance.  */
static tree
riscv_preferred_else_value (unsigned ifn, tree vectype, unsigned int nops,
                            tree *ops)
{
  if (riscv_v_ext_mode_p (TYPE_MODE (vectype)))
    return build_zero_cst (size_type_node);

  return default_preferred_else_value (ifn, vectype, nops, ops);
}

Note that we can't return a VECTOR_CST of all zeros, because an all-zero
VECTOR_CST may be a real value that the operation needs. So, to simulate
"undefined", I pass a scalar '0', which is represented as const0_rtx in RTL.

So the IR will be:

  vect__7.12_8 = .COND_LEN_DIV ({ -1, ... }, vect__4.8_22, vect__6.11_9, 0 (undefined ELSE value), _37, 0);

Then I relax the predicate in the COND_LEN_xxx patterns. It works and passes
all the tests.

Consider the following case:

#include <stdint.h>

void foo (int32_t *__restrict a, int32_t *__restrict b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = a[i] / b[i];
}

Before:

foo:
        ble     a2,zero,.L5
        mv      a4,a0
        vsetvli a5,zero,e32,m8,ta,ma
        vmv.v.i v4,0
.L3:
        vsetvli a5,a2,e32,m8,tu,ma
        vmv8r.v v1,v4
        slli    a3,a5,2
        vle32.v v3,0(a0)
        vle32.v v2,0(a1)
        sub     a2,a2,a5
        vdiv.vv v1,v3,v2
        vse32.v v1,0(a4)
        add     a0,a0,a3
        add     a1,a1,a3
        add     a4,a4,a3
        bne     a2,zero,.L3
.L5:
        ret

After:

foo:
        ble     a2,zero,.L5
        mv      a4,a0
.L3:
        vsetvli a5,a2,e32,m8,ta,ma
        slli    a3,a5,2
        vle32.v v8,0(a0)
        vle32.v v16,0(a1)
        sub     a2,a2,a5
        vdiv.vv v8,v8,v16
        vse32.v v8,0(a4)
        add     a0,a0,a3
        add     a1,a1,a3
        add     a4,a4,a3
        bne     a2,zero,.L3
.L5:
        ret

Not so elegant, but it does fix the performance/codegen issue in RVV.
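
To illustrate why the ELSE value is dead in a loop like foo, here is a rough
scalar model in plain C. It is only a sketch, not GCC code: the helper names
(cond_len_div_model, len_store_model, else_value_is_dead) and the lane count
VL are hypothetical, and it assumes the usual COND_LEN semantics where lane i
is active when i < len + bias and mask[i] is set. With an all-ones mask, as
in the IR above, the ELSE value only lands in lanes past the length, and the
length-controlled store never writes those lanes to memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VL = 8 };  /* Hypothetical number of lanes per vector.  */

/* Scalar model of one .COND_LEN_DIV: active lanes get a[i] / b[i],
   inactive lanes get else_val[i].  */
static void
cond_len_div_model (int32_t dst[VL], const uint8_t mask[VL],
                    const int32_t a[VL], const int32_t b[VL],
                    const int32_t else_val[VL], size_t len, int bias)
{
  size_t active = (size_t) ((ptrdiff_t) len + bias);
  assert (active <= VL);
  for (size_t i = 0; i < VL; i++)
    dst[i] = (i < active && mask[i]) ? a[i] / b[i] : else_val[i];
}

/* Model of the length-controlled store that follows: only the first
   LEN lanes reach memory.  */
static void
len_store_model (int32_t *mem, const int32_t src[VL], size_t len)
{
  for (size_t i = 0; i < len; i++)
    mem[i] = src[i];
}

/* Run the same step with two different ELSE vectors and return 1 if
   memory ends up identical, i.e. the ELSE value never mattered.  */
static int
else_value_is_dead (void)
{
  const uint8_t mask[VL] = { 1, 1, 1, 1, 1, 1, 1, 1 };
  const int32_t a[VL] = { 10, 20, 30, 40, 50, 60, 70, 80 };
  const int32_t b[VL] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  const int32_t zeros[VL] = { 0 };
  const int32_t junk[VL] = { 99, 98, 97, 96, 95, 94, 93, 92 };
  const size_t len = 5;

  int32_t v0[VL], v1[VL];
  int32_t mem0[VL], mem1[VL];
  memset (mem0, 0xff, sizeof mem0);
  memset (mem1, 0xff, sizeof mem1);

  cond_len_div_model (v0, mask, a, b, zeros, len, 0);
  cond_len_div_model (v1, mask, a, b, junk, len, 0);
  len_store_model (mem0, v0, len);
  len_store_model (mem1, v1, len);

  return memcmp (mem0, mem1, sizeof mem0) == 0;
}
```

In this model the two runs differ only in the tail lanes of the vector
register, which is the part a tail-agnostic (ta) policy is allowed to leave
unspecified. If the mask were not all ones, the ELSE value could land in a
lane that is stored, so the model only covers the "doesn't matter" case.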