From: "mrjjot at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/107957] New: Missed optimization in access to upper-half of a variable
Date: Sat, 03 Dec 2022 15:41:10 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107957

            Bug ID: 107957
           Summary: Missed optimization in access to upper-half of a variable
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mrjjot at gmail dot com
  Target Milestone: ---

Hello,

I think I've found an optimization opportunity for AVR GCC.
This might be similar to bug 66511, but it also affects variables smaller than 64 bits.

Please consider the following C code:

    #include <stdint.h>

    uint64_t x;
    uint32_t y;
    uint16_t z;
    uint8_t w;

    void foo(void) { y = x >> 32; }
    void bar(void) { z = y >> 16; }
    void rawr(void) { w = z >> 8; }

As you can see, all three functions just assign the upper half of one variable to another. When compiled with avr-gcc using the -Wall -Wextra -O3 flags, the following assembly is produced:

    foo():
            push r16
            lds r18,x
            lds r19,x+1
            lds r20,x+2
            lds r21,x+3
            lds r22,x+4
            lds r23,x+5
            lds r24,x+6
            lds r25,x+7
            ldi r16,lo8(32)
            rcall __lshrdi3
            sts y,r18
            sts y+1,r19
            sts y+2,r20
            sts y+3,r21
            pop r16
            ret
    bar():
            lds r24,y
            lds r25,y+1
            lds r26,y+2
            lds r27,y+3
            sts z+1,r27
            sts z,r26
            ret
    rawr():
            lds r24,z+1
            sts w,r24
            ret

I'm not a compiler expert, but I'd say that this is a missed optimization. In foo() and bar() there are twice as many lds operations as needed, and foo() additionally calls __lshrdi3 for the shift; only rawr() loads just the byte it uses. For comparison, GCC for x86_64 generates code which performs a DWORD read in foo(), a WORD read in bar() and a BYTE read in rawr().

I've found that the following definitions generate identical assembly on x86_64 and better assembly on AVR:

    void foo2(void) { y = ((uint32_t*)&x)[1]; }
    void bar2(void) { z = ((uint16_t*)&y)[1]; }
    void rawr2(void) { w = ((uint8_t*)&z)[1]; }

    foo2():
            lds r24,x+4
            lds r25,x+4+1
            lds r26,x+4+2
            lds r27,x+4+3
            sts y,r24
            sts y+1,r25
            sts y+2,r26
            sts y+3,r27
            ret
    bar2():
            lds r24,y+2
            lds r25,y+2+1
            sts z+1,r25
            sts z,r24
            ret
    rawr2():
            lds r24,z+1
            sts w,r24
            ret

I've checked my local installation of AVR GCC 12.2.0 on Manjaro and different AVR GCC versions on Godbolt. They all seem to produce the same machine code.

    $ avr-gcc -v
    Using built-in specs.
    Reading specs from /usr/lib/gcc/avr/12.2.0/device-specs/specs-avr2
    COLLECT_GCC=avr-gcc
    COLLECT_LTO_WRAPPER=/usr/lib/gcc/avr/12.2.0/lto-wrapper
    Target: avr
    Configured with: /build/avr-gcc/src/gcc-12.2.0/configure
    --disable-install-libiberty --disable-libssp --disable-libstdcxx-pch
    --disable-libunwind-exceptions --disable-linker-build-id --disable-nls
    --disable-werror --disable-__cxa_atexit --enable-checking=release
    --enable-clocale=gnu --enable-gnu-unique-object --enable-gold
    --enable-languages=c,c++ --enable-ld=default --enable-lto --enable-plugin
    --enable-shared --infodir=/usr/share/info --libdir=/usr/lib
    --libexecdir=/usr/lib --mandir=/usr/share/man --prefix=/usr --target=avr
    --with-as=/usr/bin/avr-as --with-gnu-as --with-gnu-ld --with-ld=/usr/bin/avr-ld
    --with-plugin-ld=ld.gold --with-system-zlib --with-isl
    --enable-gnu-indirect-function
    Thread model: single
    Supported LTO compression algorithms: zlib zstd
    gcc version 12.2.0 (GCC)

I'd appreciate if you could look into this. Thank you!
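
P.S. In case it helps with a test case: the shift forms and the "read the upper half from memory" forms in this report should be interchangeable on little-endian targets such as AVR and x86_64. Below is a minimal sketch that checks this on a host machine. The helper names are hypothetical and not part of the code above, and I use memcpy instead of the pointer casts because the uint32_t*/uint16_t* casts may run afoul of strict-aliasing rules. I have not checked what code avr-gcc emits for the memcpy variants.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical helpers for illustration only. */

/* Returns 1 on a little-endian host (AVR and x86_64 are), 0 otherwise. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Shift forms, mirroring foo(), bar() and rawr(). */
uint32_t upper32_shift(uint64_t v) { return (uint32_t)(v >> 32); }
uint16_t upper16_shift(uint32_t v) { return (uint16_t)(v >> 16); }
uint8_t  upper8_shift(uint16_t v)  { return (uint8_t)(v >> 8); }

/* Memory forms, mirroring foo2(), bar2() and rawr2(): copy the bytes
 * that start halfway into the object. Assumes little-endian layout. */
uint32_t upper32_copy(uint64_t v)
{
    uint32_t out;
    memcpy(&out, (const unsigned char *)&v + sizeof out, sizeof out);
    return out;
}

uint16_t upper16_copy(uint32_t v)
{
    uint16_t out;
    memcpy(&out, (const unsigned char *)&v + sizeof out, sizeof out);
    return out;
}

uint8_t upper8_copy(uint16_t v)
{
    uint8_t out;
    memcpy(&out, (const unsigned char *)&v + sizeof out, sizeof out);
    return out;
}

/* Compares both forms; does nothing on a big-endian host, where the
 * upper half lives at offset 0 and the memory forms do not apply. */
void check_upper_halves(void)
{
    if (!is_little_endian())
        return;
    assert(upper32_shift(0x1122334455667788ULL) == upper32_copy(0x1122334455667788ULL));
    assert(upper16_shift(0x11223344UL) == upper16_copy(0x11223344UL));
    assert(upper8_shift(0x1122U) == upper8_copy(0x1122U));
}
```

Calling check_upper_halves() from main() on a little-endian host should pass silently. This only confirms that the two source forms compute the same value, so a compiler is free to turn the shift into a narrower load; it says nothing about which instructions a given GCC version actually emits.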