public inbox for gcc-bugs@sourceware.org
* [Bug target/107957] New: Missed optimization in access to upper-half of a variable
@ 2022-12-03 15:41 mrjjot at gmail dot com
  2022-12-05 16:11 ` [Bug target/107957] [AVR] " mrjjot at gmail dot com
  0 siblings, 1 reply; 2+ messages in thread
From: mrjjot at gmail dot com @ 2022-12-03 15:41 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107957

            Bug ID: 107957
           Summary: Missed optimization in access to upper-half of a
                    variable
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mrjjot at gmail dot com
  Target Milestone: ---

Hello, 

I think I've found an optimization opportunity for AVR GCC. This might be
similar to bug 66511, but also affects variables smaller than 64 bits. Please
consider the following C code:

#include <stdint.h>

uint64_t x;
uint32_t y;
uint16_t z;
uint8_t w;

void foo(void)
{
    y = x >> 32;
}

void bar(void)
{
    z = y >> 16;
}

void rawr(void)
{
    w = z >> 8;
}

As you can see, all three functions simply assign the upper half of one
variable to another. When compiled with avr-gcc using the -Wall -Wextra -O3
flags, the following assembly is produced:

foo():
        push r16
        lds r18,x
        lds r19,x+1
        lds r20,x+2
        lds r21,x+3
        lds r22,x+4
        lds r23,x+5
        lds r24,x+6
        lds r25,x+7
        ldi r16,lo8(32)
        rcall __lshrdi3
        sts y,r18
        sts y+1,r19
        sts y+2,r20
        sts y+3,r21
        pop r16
        ret
bar():
        lds r24,y
        lds r25,y+1
        lds r26,y+2
        lds r27,y+3
        sts z+1,r27
        sts z,r26
        ret
rawr():
        lds r24,z+1
        sts w,r24
        ret

I'm not a compiler expert, but I'd say that this is a missed optimization. In
every case there are twice as many lds instructions as needed. For comparison,
GCC for x86_64 generates code that performs a DWORD read in foo(), a WORD read
in bar() and a BYTE read in rawr(). I've found that the following definitions
generate the same assembly as the originals on x86_64 and better assembly on
AVR:

void foo2(void)
{
    y = ((uint32_t*)&x)[1];
}

void bar2(void)
{
    z = ((uint16_t*)&y)[1];
}

void rawr2(void)
{
    w = ((uint8_t*)&z)[1];
}

foo2():
        lds r24,x+4
        lds r25,x+4+1
        lds r26,x+4+2
        lds r27,x+4+3
        sts y,r24
        sts y+1,r25
        sts y+2,r26
        sts y+3,r27
        ret
bar2():
        lds r24,y+2
        lds r25,y+2+1
        sts z+1,r25
        sts z,r24
        ret
rawr2():
        lds r24,z+1
        sts w,r24
        ret
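
As a side note, the casts in foo2() and bar2() read an object through a pointer
to a different type, which the C aliasing rules do not permit (rawr2() is fine,
since uint8_t is a character type in practice). Below is a minimal sketch of the
same upper-half reads written with memcpy instead. It assumes a little-endian
target, as both AVR and x86_64 are. The names foo3() and bar3() are made up for
this illustration, and the AVR code generated for them has not been checked
here.

```c
#include <stdint.h>
#include <string.h>

uint64_t x;
uint32_t y;
uint16_t z;

/* Copy the upper 32 bits of x into y.
 * Assumption: little-endian layout, so the upper half is bytes 4..7. */
void foo3(void)
{
    memcpy(&y, (const unsigned char *)&x + sizeof y, sizeof y);
}

/* Copy the upper 16 bits of y into z.
 * Assumption: little-endian layout, so the upper half is bytes 2..3. */
void bar3(void)
{
    memcpy(&z, (const unsigned char *)&y + sizeof z, sizeof z);
}
```

On targets where the compiler expands a small fixed-size memcpy inline, this
form may produce the same loads as the pointer-cast versions, but that would
need to be confirmed per target and GCC version.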

I've checked my local installation of AVR GCC 12.2.0 on Manjaro and different
AVR GCC versions on Godbolt. They all seem to produce the same machine code.

$ avr-gcc -v
Using built-in specs.
Reading specs from /usr/lib/gcc/avr/12.2.0/device-specs/specs-avr2
COLLECT_GCC=avr-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/avr/12.2.0/lto-wrapper
Target: avr
Configured with: /build/avr-gcc/src/gcc-12.2.0/configure
--disable-install-libiberty --disable-libssp --disable-libstdcxx-pch
--disable-libunwind-exceptions --disable-linker-build-id --disable-nls
--disable-werror --disable-__cxa_atexit --enable-checking=release
--enable-clocale=gnu --enable-gnu-unique-object --enable-gold
--enable-languages=c,c++ --enable-ld=default --enable-lto --enable-plugin
--enable-shared --infodir=/usr/share/info --libdir=/usr/lib
--libexecdir=/usr/lib --mandir=/usr/share/man --prefix=/usr --target=avr
--with-as=/usr/bin/avr-as --with-gnu-as --with-gnu-ld --with-ld=/usr/bin/avr-ld
--with-plugin-ld=ld.gold --with-system-zlib --with-isl
--enable-gnu-indirect-function
Thread model: single
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (GCC)

I'd appreciate it if you could look into this. Thank you!

* [Bug target/107957] [AVR] Missed optimization in access to upper-half of a variable
  2022-12-03 15:41 [Bug target/107957] New: Missed optimization in access to upper-half of a variable mrjjot at gmail dot com
@ 2022-12-05 16:11 ` mrjjot at gmail dot com
  0 siblings, 0 replies; 2+ messages in thread
From: mrjjot at gmail dot com @ 2022-12-05 16:11 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107957

--- Comment #1 from Jacek Wieczorek <mrjjot at gmail dot com> ---
A small correction: I just noticed my mistake. The assembly for rawr() is in
fact already optimal. Apparently the problem only affects variables wider than
16 bits.
