From: "gjl at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/109907] New: [avr] Missed optimization for bit extraction (uses shift instead of single bit-test)
Date: Fri, 19 May 2023 10:10:10 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

            Bug ID: 109907
           Summary: [avr] Missed optimization for bit extraction (uses shift
                    instead of single bit-test)
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 55116 -->
https://gcc.gnu.org/bugzilla/attachment.cgi?id=55116&action=edit
C test case.

The following missed optimization occurs with current v14 master and also with
older versions of the compiler:

$ avr-gcc ext.c -dumpbase "" -save-temps -dp -mmcu=atmega128 -c -Os

Functions like

uint8_t cset_32bit31 (uint32_t x)
{
    return (x & (1ul << 31)) ? 1 : 0; // bloat
}

that extract a single bit might generate very expensive code like:

cset_32bit31:
        movw r26,r24     ;  18  [c=4 l=1]  *movhi/0
        movw r24,r22     ;  19  [c=4 l=1]  *movhi/0
        lsl r27          ;  24  [c=16 l=4]  *ashrsi3_const/3
        sbc r24,r24
        mov r25,r24
        movw r26,r24
        andi r24,lo8(1)  ;  12  [c=4 l=1]  andqi3/1
        ret              ;  22  [c=0 l=1]  return

where the following 3 instructions would suffice. This is smaller, faster, and
imposes no additional register pressure:

        bst r25,7        ;  16  [c=4 l=3]  *extzv/4
        clr r24
        bld r24,0

Loading 0 or 1 depending on a single bit would also work:

        LDI  r24, 0   # R24 = 0
        SBRC r25, 7   # Skip next instruction if R25.7 == 0.
        LDI  r24, 1   # R24 = 1

The bloat also occurs when the complement of the bit is extracted, as in

uint8_t cset_32bit30_not (uint32_t x)
{
    return (x & (1ul << 30)) ? 0 : 1; // bloat
}

cset_32bit30_not:
        movw r26,r24     ;  19  [c=4 l=1]  *movhi/0
        movw r24,r22     ;  20  [c=4 l=1]  *movhi/0
        ldi r18,30       ;  25  [c=44 l=7]  *lshrsi3_const/3
        1:
        lsr r27
        ror r26
        ror r25
        ror r24
        dec r18
        brne 1b
        ldi r18,1        ;  7   [c=32 l=2]  xorsi3/2
        eor r24,r18
        andi r24,lo8(1)  ;  13  [c=4 l=1]  andqi3/1
        ret              ;  23  [c=0 l=1]  return

This case is even worse because it uses a loop of 30 single-bit shifts to
extract the bit. Again, skipping one instruction depending on a bit would be
possible:

        LDI  r24, 1   # R24 = 1
        SBRC r25, 6   # Skip next instruction if R25.6 == 0.
        LDI  r24, 0   # R24 = 0

or

        LDI  r24, 0   # R24 = 0
        SBRS r25, 6   # Skip next instruction if R25.6 == 1.
        LDI  r24, 1   # R24 = 1

or extract one bit using the T-flag:

        BST r25, 6     # SREG.T = R25.6
        LDI r24, 0xff  # R24 = 0xff
        BLD r24, 0     # R24.0 = SREG.T
        COM r24        # R24 = R24 ^ 0xff

-------------------------------------------------------

Configured with: --target=avr --disable-nls --with-dwarf2 --with-gnu-as
--with-gnu-ld --disable-shared --enable-languages=c,c++

gcc version 14.0.0 20230518 (experimental) (GCC)
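
For anyone who wants to sanity-check the proposed sequences on a host machine, here is a minimal C sketch that models two of them and compares the results against the two functions from the test case. It assumes, as the assembly above suggests, that the most significant byte of x arrives in r25. The helper names top_byte, model_skip_bit31, model_tflag_bit30_not and count_mismatches are made up for illustration and are not part of the attachment or of GCC.

```c
#include <stdint.h>

/* Reference functions, copied from the test case in this report. */
uint8_t cset_32bit31(uint32_t x)     { return (x & (1ul << 31)) ? 1 : 0; }
uint8_t cset_32bit30_not(uint32_t x) { return (x & (1ul << 30)) ? 0 : 1; }

/* Hypothetical helper: the byte assumed to live in r25 (bits 31..24 of x). */
uint8_t top_byte(uint32_t x) { return (uint8_t)(x >> 24); }

/* Model of: LDI r24,0 / SBRC r25,7 / LDI r24,1 */
uint8_t model_skip_bit31(uint32_t x)
{
    uint8_t r25 = top_byte(x);
    uint8_t r24 = 0;                    /* LDI r24, 0 */
    if (r25 & 0x80)                     /* SBRC r25, 7: skips the next insn if bit 7 is clear */
        r24 = 1;                        /* LDI r24, 1 */
    return r24;
}

/* Model of: BST r25,6 / LDI r24,0xff / BLD r24,0 / COM r24 */
uint8_t model_tflag_bit30_not(uint32_t x)
{
    uint8_t r25 = top_byte(x);
    uint8_t t   = (uint8_t)((r25 >> 6) & 1);     /* BST r25, 6: T = R25.6 */
    uint8_t r24 = 0xff;                          /* LDI r24, 0xff */
    r24 = (uint8_t)((r24 & 0xfe) | t);           /* BLD r24, 0: R24.0 = T */
    r24 = (uint8_t)~r24;                         /* COM r24: one's complement */
    return r24;
}

/* Compare the models with the reference functions for all 256 values of
   the top byte; the lower 24 bits are an arbitrary filler pattern. */
int count_mismatches(void)
{
    int bad = 0;
    for (uint32_t hi = 0; hi < 256; hi++) {
        uint32_t x = (hi << 24) | 0x00ABCDEFu;
        if (model_skip_bit31(x) != cset_32bit31(x))
            bad++;
        if (model_tflag_bit30_not(x) != cset_32bit30_not(x))
            bad++;
    }
    return bad;
}
```

Calling count_mismatches() from a small main should return 0 if the models agree with the reference functions. This only checks the arithmetic of the sequences, not what avr-gcc actually emits.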