From: "syq at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/111376] missed optimization of one bit test on MIPS32r1
Date: Sat, 15 Jun 2024 06:47:06 +0000
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: INVALID

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376

--- Comment #13 from YunQiang Su ---
I tried inserting

  li $3, 500
  li $5, 500

between SLL/BGEZ and LUI+AND/BNE. The latter is still somewhat faster on Loongson 3A4000.

I noticed the following in the 74K software manual:

The 74K core’s ALU is pipelined. Some ALU instructions complete the operation and bypass the results in this cycle.
These instructions are referred to as single-cycle ops and they include all logical instructions (AND, ANDI, OR, ORI, XOR, XORI, LUI), some shift instructions (SLL sa<=8, SRL 31<=sa<=25), and some arithmetic instructions (ADD rt=0, ADDU rt=0, SLT, SLTI, SLTU, SLTIU, SEH, SEB, ZEH, ZEB). In addition, add instructions (ADD, ADDU, ADDI, ADDIU) complete the operation and bypass results to the ALU pipe in this cycle.

I guess this means that SLL may be somewhat slower when sa>8. On Loongson 3A4000, the value seems to be 20/21. This may mean that we need to take care with this for 64-bit.

Could you run a test on XBurst 1?
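
For reference, here is a rough C sketch of the two one-bit-test forms being compared, assuming a 32-bit operand. The function names are hypothetical and only for illustration. The threshold of 8 comes from the 74K manual text quoted above; it says nothing about other cores such as Loongson 3A4000 or XBurst 1.

```c
#include <assert.h>
#include <stdint.h>

/* Mask form, roughly what LUI+AND/BNE does: build a one-bit mask and AND it.
 * (A mask built by LUI alone only covers bits 16..31.) Requires bit <= 31. */
static int bit_set_mask(uint32_t x, unsigned bit)
{
    return (x & (UINT32_C(1) << bit)) != 0;
}

/* Shift form, roughly what SLL/BGEZ does: shift the tested bit up into
 * bit 31 (the sign bit) and test that. Requires bit <= 31. */
static int bit_set_shift(uint32_t x, unsigned bit)
{
    return ((x << (31u - bit)) & UINT32_C(0x80000000)) != 0;
}

/* Shift amount (sa) the SLL needs in order to test the given bit. */
static unsigned sll_amount_for_bit(unsigned bit)
{
    return 31u - bit;
}

/* Per the quoted 74K text, SLL is a single-cycle op only when sa <= 8.
 * This is an assumption taken from that quote, not a measured result. */
static int sll_single_cycle_74k(unsigned bit)
{
    return sll_amount_for_bit(bit) <= 8u;
}

/* Self-check: both forms give the same answer for every bit position. */
static void check_forms_agree(void)
{
    const uint32_t samples[] = {
        0u, 1u, UINT32_C(0x80000000), UINT32_C(0x00100000),
        UINT32_C(0xdeadbeef), UINT32_C(0xffffffff)
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned bit = 0; bit < 32; bit++)
            assert(bit_set_mask(samples[i], bit) == bit_set_shift(samples[i], bit));
}
```

Under that reading, only tests of bits 23..31 would use an SLL with sa<=8 on the 74K; a test of any lower bit needs a larger shift amount, which is where the shift form might lose to LUI+AND.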