From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/113112] New: RISC-V: Dynamic LMUL feature stabilization for GCC-14 release
Date: Fri, 22 Dec 2023 09:04:42 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113112

            Bug ID: 113112
           Summary: RISC-V: Dynamic LMUL feature stabilization for GCC-14
                    release
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

Created attachment 56922
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56922&action=edit
dynamic LMUL fail case

Hi,

As we know, the dynamic LMUL feature is supported but not yet stable.

As far as I know, full coverage testing shows only these 2 execution FAILs:

FAIL: gcc.dg/pr30957-1.c execution test
FAIL: gcc.dg/signbit-5.c execution test

They are not real failures; the tests need to be adjusted.

I have tested on K230 and other hardware. It turns out that we get over 30%
performance improvement (compared with the default LMUL = M1) on various
benchmarks when we select a reasonably large LMUL (one that causes no
additional register spills).

However, I also found that some benchmarks show a significant performance
drop (compared with the default LMUL = M1) when using dynamic LMUL. I am
pretty sure this is because we pick an LMUL that is too large (LMUL > 1),
which causes additional register spills and hence bad performance in such
situations.

For example:

#include
#define N 40

int a[N];

__attribute__ ((noinline))
int foo (int n){
  int i,j;
  int sum,x;
  for (i = 0; i < n; i++) {
    sum = 0;
    for (j = 0; j < n; j++) {
      sum += (i + j);
    }
    a[i] = sum;
  }
  return 0;
}

-march=rv64gcv -mabi=lp64d -O3 --param riscv-autovec-lmul=dynamic --param riscv-autovec-preference=fixed-vlmax

ASM:

foo:
        ble     a0,zero,.L11
        lui     a2,%hi(.LANCHOR0)
        addi    sp,sp,-128
        addi    a2,a2,%lo(.LANCHOR0)
        mv      a1,a0
        vsetvli a6,zero,e32,m8,ta,ma
        vid.v   v8
        vs8r.v  v8,0(sp)
.L3:
        vl8re32.v       v16,0(sp)
        vsetvli a4,a1,e8,m2,ta,ma
        li      a3,0
        vsetvli a5,zero,e32,m8,ta,ma
        vmv8r.v v0,v16
        vmv.v.x v8,a4
        vmv.v.i v24,0
        vadd.vv v8,v16,v8
        vmv8r.v v16,v24
        vs8r.v  v8,0(sp)
.L4:
        addiw   a3,a3,1
        vadd.vv v8,v0,v16
        vadd.vi v16,v16,1
        vadd.vv v24,v24,v8
        bne     a0,a3,.L4
        vsetvli zero,a4,e32,m8,ta,ma
        sub     a1,a1,a4
        vse32.v v24,0(a2)
        slli    a4,a4,2
        add     a2,a2,a4
        bne     a1,zero,.L3
        li      a0,0
        addi    sp,sp,128
        jr      ra
.L11:
        li      a0,0
        ret

As we can see, LMUL = 8 is picked here and the code then spills.

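To make the spill argument concrete: RVV has 32 vector registers, so at LMUL = 8 there are only 4 register groups (v0, v8, v16, v24). The listing above uses all four in the inner loop and keeps one more m8 value on the stack (the vs8r.v / vl8re32.v pair), which suggests about five live vector values. The sketch below is only a back-of-the-envelope check under that assumption. The helper name largest_lmul_without_spill is hypothetical, it is not GCC's dynamic LMUL cost model, and it ignores details such as v0 being reserved for masks.

```c
/* RVV provides 32 architectural vector registers, v0..v31. */
#define RVV_NUM_VREGS 32

/* Hypothetical helper (an assumption for illustration, not GCC code):
   return the largest integer LMUL in {8, 4, 2, 1} for which
   LIVE_VALUES simultaneously live vector values each fit in their
   own register group, or 0 if they do not fit even at LMUL = 1.  */
static int
largest_lmul_without_spill (int live_values)
{
  for (int lmul = 8; lmul >= 1; lmul /= 2)
    {
      /* Each value occupies LMUL consecutive registers, so only
         32 / LMUL groups exist at this LMUL.  */
      int groups = RVV_NUM_VREGS / lmul;
      if (live_values <= groups)
        return lmul;
    }
  return 0;
}
```

With five live values this returns 4 rather than 8, which matches the intuition that LMUL = 8 is one register group short for this loop.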
This case was found with the following code, which I added to the mov pattern:

  if (known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)
      && riscv_autovec_lmul == RVV_DYNAMIC
      && lra_in_progress)
    gcc_unreachable ();

The attachment lists the cases where we pick an LMUL that is too large, which
causes additional spills.

I will work on this issue in the following days.