From: "lis8215 at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/111376] missed optimization of one bit test on MIPS32r1
Date: Sat, 15 Jun 2024 08:13:20 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: unknown
X-Bugzilla-Severity: normal
X-Bugzilla-Who: lis8215 at gmail dot com
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: INVALID
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: syq at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: attachments.created
Content-Type: text/plain; charset="UTF-8"
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376

--- Comment #15 from Siarhei Volkau ---
Created attachment 58437
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58437&action=edit
application to test performance of shift

Here is the test application (MIPS32 specific) I wrote.
It measures execution time and makes it possible to detect extra pipeline
stalls for SLL, if they take place.

For XBurst 1 (jz4725b) the results are as follows:

`SLL to use latency test` execution median: 168417 ns, min: 168416 ns
`SLL to use latency test with nop` execution median: 196250 ns, min: 196166 ns
`SLL to branch latency test` execution median: 196250 ns, min: 196166 ns
`SLL to branch latency test with nop` execution median: 224000 ns, min: 224000 ns
`SLL by 7 to use latency test` execution median: 168417 ns, min: 168416 ns
`SLL by 15 to use latency test` execution median: 168417 ns, min: 168416 ns
`SLL by 23 to use latency test` execution median: 168417 ns, min: 168416 ns
`SLL by 31 to use latency test` execution median: 168417 ns, min: 168416 ns
`LUI>AND>BEQZ reference test` execution median: 196250 ns, min: 196166 ns
`SLL>BGEZ reference test` execution median: 168417 ns, min: 168416 ns

What this means:

`SLL to use latency test` at 168417 ns versus `.. with nop` at 196250 ns
means that there are no extra stall cycles between SLL and a subsequent use
of its result by an ALU operation: the inserted NOP adds about 27.8 us
instead of being absorbed by a stall.

The `SLL to branch latency test` and `.. with nop` results (196250 ns versus
224000 ns) mean, by the same reasoning, that there are no extra stall cycles
between SLL and a subsequent use of its result by a branch operation.

The `SLL by N` results are identical for N = 7, 15, 23 and 31, which means
that SLL execution time doesn't depend on the shift amount.

Finally, the reference test results show that the SLL>BGEZ approach
(168417 ns) is faster than LUI>AND>BEQZ (196250 ns).
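
For readers who want to see why the two reference sequences are interchangeable, here is a minimal C sketch of the equivalence they rely on. It is not taken from the attached test application; the function names (`bit_set_mask`, `bit_set_shift`, `count_mismatches`) and the sample values are made up for illustration. The mask form corresponds to the LUI>AND>BEQZ idea (build a one-bit mask, AND, compare with zero), and the shift form corresponds to the SLL>BGEZ idea (shift the tested bit into the sign position, test the sign). How a particular compiler actually maps either form to instructions is not shown here. The sketch assumes a 32-bit `unsigned int`, as on MIPS32.

```c
#include <assert.h>
#include <stdint.h>

/* Mask form: build a one-bit mask, AND it with x, compare with zero.
 * Valid for bit positions n in 0..31. */
int bit_set_mask(uint32_t x, unsigned n)
{
    return (x & (UINT32_C(1) << n)) != 0;
}

/* Shift form: move bit n into bit 31 (the sign position) and test it.
 * Valid for bit positions n in 0..31, so the shift amount is 0..31. */
int bit_set_shift(uint32_t x, unsigned n)
{
    uint32_t shifted = x << (31u - n);
    return (shifted >> 31) != 0; /* top bit set means "negative" as int32 */
}

/* Compare both forms over a few sample words and every bit position.
 * Returns the number of disagreements (expected to be zero). */
unsigned count_mismatches(void)
{
    /* Hypothetical sample values, chosen only to cover edge cases. */
    static const uint32_t samples[] = {
        0x00000000u, 0x00000001u, 0x00000080u, 0x00008000u,
        0x00800000u, 0x80000000u, 0xFFFFFFFFu, 0xDEADBEEFu,
    };
    unsigned mismatches = 0;

    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        for (unsigned n = 0; n < 32; n++) {
            if (bit_set_mask(samples[i], n) != bit_set_shift(samples[i], n))
                mismatches++;
        }
    }
    assert(mismatches == 0);
    return mismatches;
}
```

Calling `count_mismatches()` from a small `main` is enough to check the equivalence on a host machine. It says nothing about timing; that part still needs the measurements above on real hardware.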