From: "palmer at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/109547] New: RISC-V: Multiple vsetvli for load/store loop
Date: Tue, 18 Apr 2023 22:24:15 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109547

            Bug ID: 109547
           Summary: RISC-V: Multiple vsetvli for load/store loop
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: palmer at gcc dot gnu.org
  Target Milestone: ---

I was just poking around with a simple loop using the vector intrinsics and
found some odd generated code.
This is on the gcc-13 branch, but that's pretty close to trunk so I'm filing
it for 14.  I'm probably not going to have time to look at it for a bit, as it
seems to just be a performance issue.

$ cat test.c
#include <riscv_vector.h>

void func(unsigned char *out, unsigned char *in, unsigned long len) {
  unsigned long i = 0;
  while (i < len) {
    unsigned long vl = __riscv_vsetvl_e8m1(len - i);
    vuint8m1_t r = __riscv_vle8_v_u8m1(in + i, vl);
    __riscv_vse8_v_u8m1(out + i, r, vl);
    i += vl;
  }
}

$ ../toolchain/install/bin/riscv64-unknown-linux-gnu-gcc test.c -O3 -c -S -o- -march=rv64gcv -fdump-rtl-all
        .file   "test.c"
        .option nopic
        .attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_v1p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0"
        .attribute unaligned_access, 0
        .attribute stack_align, 16
        .text
        .align  1
        .globl  func
        .type   func, @function
func:
.LFB2:
        .cfi_startproc
        beq     a2,zero,.L1
        li      a5,0
.L3:
        sub     a4,a2,a5
        add     a6,a1,a5
        add     a3,a0,a5
        vsetvli a4,a4,e8,m1,ta,mu
        vsetvli zero,a4,e8,m1,ta,ma
        add     a5,a5,a4
        vle8.v  v24,0(a6)
        vse8.v  v24,0(a3)
        bgtu    a2,a5,.L3
.L1:
        ret
        .cfi_endproc
.LFE2:
        .size   func, .-func
        .ident  "GCC: (g85b95ea729c) 13.0.1 20230417 (prerelease)"
        .section        .note.GNU-stack,"",@progbits

The odd part is the pair of back-to-back vsetvli instructions in the loop
body: they use the same AVL, SEW and LMUL and differ only in the destination
register and the mask policy (mu vs. ma).
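
For reference, here is a scalar sketch of what the loop in test.c is expected
to compute, which may be handy for checking the copied bytes on a host
without the V extension.  Everything named model_* and MODEL_VLMAX is
hypothetical and not part of any real API.  It assumes VLEN is 128 bits (the
minimum implied by zvl128b in the arch string above), so VLMAX for e8/m1 is
16, and it assumes vsetvl returns min(AVL, VLMAX), which is only one of the
behaviours an implementation may choose.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: VLEN = 128 bits, SEW = 8, LMUL = 1, so VLMAX = 128 / 8 = 16. */
#define MODEL_VLMAX 16u

/* Hypothetical scalar stand-in for __riscv_vsetvl_e8m1().
   Assumes the simple min(avl, VLMAX) policy. */
static size_t model_vsetvl_e8m1(size_t avl)
{
    return avl < MODEL_VLMAX ? avl : MODEL_VLMAX;
}

/* Same control flow as func() in test.c, with the vector load/store pair
   replaced by memcpy.  Returns the number of loop iterations taken. */
static size_t model_func(unsigned char *out, const unsigned char *in,
                         size_t len)
{
    size_t i = 0;
    size_t iterations = 0;

    while (i < len) {
        size_t vl = model_vsetvl_e8m1(len - i);
        assert(vl > 0);              /* the loop only terminates if vl > 0 */
        memcpy(out + i, in + i, vl); /* models vle8.v followed by vse8.v */
        i += vl;
        iterations++;
    }
    return iterations;
}
```

Under these assumptions a 40-byte copy takes three iterations (16, 16 and 8
bytes), and one vsetvli per iteration would be enough to set vl for both the
load and the store.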