public inbox for gcc-bugs@sourceware.org
* [Bug target/65371] New: arm loop with volatile variable
@ 2015-03-10  0:49 gcc-bugzilla at enginuities dot com
  2015-03-13  9:48 ` [Bug rtl-optimization/65371] " gcc-bugzilla at enginuities dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: gcc-bugzilla at enginuities dot com @ 2015-03-10  0:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65371

            Bug ID: 65371
           Summary: arm loop with volatile variable
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gcc-bugzilla at enginuities dot com
            Target: arm-none-eabi (Cortex-M3)

I've found this behaviour with gcc 4.8.4, 4.9.2, and 5.0.0 (20150308) (all
compiled with the same flags) on Arch Linux (3.18.2-2-ARCH x86_64).

I've also had the same behaviour using the precompiled gcc-arm-embedded
toolchain from Launchpad (gcc-arm-none-eabi-4_9-2014q4).

Compiling func.c:

#define PERIPH_BASE        0x4001000
#define PERIPH             ((PERIPH_TypeDef *) PERIPH_BASE)

typedef struct
{
    volatile unsigned long REG1;
} PERIPH_TypeDef;

void func()
{
    PERIPH->REG1 |= 0x00010000;
    while (!(PERIPH->REG1 & 0x00020000)) { }
    PERIPH->REG1 |= 0x01000000;
}

With:

arm-none-eabi-gcc -mthumb -mcpu=cortex-m3 -Os -c func.c

Results in the following disassembly of func.o:

00000000 <func>:
   0:   4b06            ldr     r3, [pc, #24]   ; (1c <func+0x1c>)
   2:   681a            ldr     r2, [r3, #0]
   4:   f442 3280       orr.w   r2, r2, #65536  ; 0x10000
   8:   601a            str     r2, [r3, #0]
   a:   6819            ldr     r1, [r3, #0]
   c:   4a03            ldr     r2, [pc, #12]   ; (1c <func+0x1c>)
   e:   0389            lsls    r1, r1, #14
  10:   d5fb            bpl.n   a <func+0xa>
  12:   6813            ldr     r3, [r2, #0]
  14:   f043 7380       orr.w   r3, r3, #16777216       ; 0x1000000
  18:   6013            str     r3, [r2, #0]
  1a:   4770            bx      lr
  1c:   04001000        streq   r1, [r0], #-0
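
For readers less familiar with Thumb-2, the lsls/bpl pair at 0xe and 0x10 is how
the compiler tests bit 17 (0x00020000): shifting left by 14 moves bit 17 into
bit 31, which sets the N flag, and bpl branches back while N is clear. A minimal
C sketch of that equivalence follows; the helper names are hypothetical and not
part of the report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors `lsls r1, r1, #14` followed by `bpl`: the shift moves bit 17
   into bit 31 (the N flag position), and bpl is taken while N is clear. */
static bool loop_continues_shift(uint32_t reg)
{
    uint32_t shifted = reg << 14;          /* bit 17 -> bit 31 */
    return (shifted & 0x80000000u) == 0;   /* N clear => branch back */
}

/* The loop condition as written in func.c. */
static bool loop_continues_mask(uint32_t reg)
{
    return !(reg & 0x00020000u);
}
```

Both helpers agree for every 32-bit input, because only bit 17 of the register
value survives into bit 31 after the shift.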

The last line in func.c (PERIPH->REG1 |= 0x01000000;) causes the compiled code
to reload the address of PERIPH->REG1 into r2 on every iteration of the loop, at
<func+0x0c> (ldr     r2, [pc, #12]), and then to use r2 after the loop. The
reload is redundant: the same address was already loaded into r3 at <func+0x00>
(ldr     r3, [pc, #24]) and it doesn't change.
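
At the source level the function touches a single, fixed address. The sketch
below makes that explicit by passing the register address in as a parameter, so
it is computed exactly once. This is a hypothetical variant for illustration
(func_with_local is not in the report), and it has not been checked whether
writing the code this way changes GCC's register allocation.

```c
/* Hypothetical variant of func(): the register address is a parameter,
   so the same pointer is used for every volatile access. */
void func_with_local(volatile unsigned long *reg)
{
    *reg |= 0x00010000UL;                  /* read-modify-write, bit 16 */
    while (!(*reg & 0x00020000UL)) { }     /* poll bit 17 */
    *reg |= 0x01000000UL;                  /* read-modify-write, bit 24 */
}
```

On the target it would be called as func_with_local(&PERIPH->REG1); on a host it
can be exercised against an ordinary variable that already has bit 17 set.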

As such I would expect something like this disassembly of func.o:

00000000 <func>:
   0:   4b06            ldr     r3, [pc, #24]   ; (1c <func+0x1c>)
   2:   681a            ldr     r2, [r3, #0]
   4:   f442 3280       orr.w   r2, r2, #65536  ; 0x10000
   8:   601a            str     r2, [r3, #0]
   a:   681a            ldr     r2, [r3, #0]
   c:   0392            lsls    r2, r2, #14
   e:   d5fc            bpl.n   a <func+0xa>
  10:   681a            ldr     r2, [r3, #0]
  12:   f042 7280       orr.w   r2, r2, #16777216       ; 0x1000000
  16:   601a            str     r2, [r3, #0]
  18:   4770            bx      lr
  1a:   bf00            nop
  1c:   04001000        streq   r1, [r0], #-0

arm-none-eabi-gcc -v:

Using built-in specs.
COLLECT_GCC=arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/home/test_user/toolchain/libexec/gcc/arm-none-eabi/5.0.0/lto-wrapper
Target: arm-none-eabi
Configured with: /home/test_user/temp/gcc-5-20150308/gcc-5-20150308/configure
--disable-decimal-float --disable-libffi --disable-libgomp --disable-libmudflap
--disable-libquadmath --disable-libssp --disable-libstdcxx-pch --disable-nls
--disable-shared --disable-threads --disable-tls --enable-languages=c,c++
--prefix=/home/test_user/toolchain --target=arm-none-eabi
--with-gmp=/home/test_user/toolchain --with-gnu-as --with-gnu-ld
--with-mpc=/home/test_user/toolchain --with-mpfr=/home/test_user/toolchain
--with-newlib --without-headers
Thread model: single
gcc version 5.0.0 20150308 (experimental) (GCC)



* [Bug rtl-optimization/65371] arm loop with volatile variable
  2015-03-10  0:49 [Bug target/65371] New: arm loop with volatile variable gcc-bugzilla at enginuities dot com
@ 2015-03-13  9:48 ` gcc-bugzilla at enginuities dot com
  2015-06-26  8:08 ` ramana at gcc dot gnu.org
  2015-06-26  9:55 ` gcc-bugzilla at enginuities dot com
  2 siblings, 0 replies; 4+ messages in thread
From: gcc-bugzilla at enginuities dot com @ 2015-03-13  9:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65371

--- Comment #2 from Stuart <gcc-bugzilla at enginuities dot com> ---
I compiled it for x86_64 and thought it was fine; however, after your comment I
tried compiling it with clang/llvm and can see the difference (I'm not
particularly familiar with the full instruction set)...

I've found another case which could also be improved:

func.c:

#include <stdint.h>

#define PERIPH_BASE        0x40000000
#define PERIPH             ((PERIPH_TypeDef *) PERIPH_BASE)

typedef struct
{
    volatile uint32_t REG1;
} PERIPH_TypeDef;

void func(uint16_t a)
{
    uint32_t t = PERIPH->REG1;

    while ((uint16_t) (PERIPH->REG1 - t) < a) { }
}

gives:

00000000 <func>:
   0:   f04f 4380       mov.w   r3, #1073741824 ; 0x40000000
   4:   461a            mov     r2, r3
   6:   6819            ldr     r1, [r3, #0]
   8:   6813            ldr     r3, [r2, #0]
   a:   1a5b            subs    r3, r3, r1
   c:   b29b            uxth    r3, r3
   e:   4283            cmp     r3, r0
  10:   d3fa            bcc.n   8 <func+0x8>
  12:   4770            bx      lr

For some reason r3 is copied into r2, and then the value at the address in r2 is
loaded into r3 inside the loop!

I would expect the following:

00000000 <func>:
   0:   f04f 4180       mov.w   r1, #1073741824 ; 0x40000000
   4:   680a            ldr     r2, [r1, #0]
   6:   680b            ldr     r3, [r1, #0]
   8:   1a9b            subs    r3, r3, r2
   a:   b29b            uxth    r3, r3
   c:   4283            cmp     r3, r0
   e:   d3fa            bcc.n   6 <func+0x6>
  10:   4770            bx      lr
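
The loop condition in this example is a wraparound-tolerant elapsed-count
check: the unsigned subtraction is taken modulo 2^32 and the cast keeps the low
16 bits, which is what the subs/uxth pair in the listings implements. A minimal
sketch with plain values in place of the volatile register (still_waiting is a
hypothetical helper, not from the report):

```c
#include <stdbool.h>
#include <stdint.h>

/* Same expression as the loop condition, with the two register reads
   replaced by ordinary arguments. */
static bool still_waiting(uint32_t now, uint32_t start, uint16_t a)
{
    /* now - start wraps modulo 2^32 (subs); the cast truncates to the
       low 16 bits (uxth) before the unsigned comparison (cmp/bcc). */
    return (uint16_t)(now - start) < a;
}
```

For instance, with start = 0xFFFFFFFE and now = 5 the counter has wrapped, yet
the difference is still 7, so the loop keeps waiting for a = 8 and exits for
a = 7.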



* [Bug rtl-optimization/65371] arm loop with volatile variable
  2015-03-10  0:49 [Bug target/65371] New: arm loop with volatile variable gcc-bugzilla at enginuities dot com
  2015-03-13  9:48 ` [Bug rtl-optimization/65371] " gcc-bugzilla at enginuities dot com
@ 2015-06-26  8:08 ` ramana at gcc dot gnu.org
  2015-06-26  9:55 ` gcc-bugzilla at enginuities dot com
  2 siblings, 0 replies; 4+ messages in thread
From: ramana at gcc dot gnu.org @ 2015-06-26  8:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65371

Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-06-26
                 CC|                            |ramana at gcc dot gnu.org
      Known to work|                            |6.0
     Ever confirmed|0                           |1
      Known to fail|                            |5.1.0

--- Comment #3 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
This smells of a dup of PR65768 ... 

For the example in comment #1, trunk generates:


func:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, .L6
        ldr     r2, [r3]
        orr     r2, r2, #65536
        str     r2, [r3]
.L2:
        ldr     r2, [r3]
        lsls    r2, r2, #14
        bpl     .L2
        ldr     r2, [r3]
        orr     r2, r2, #16777216
        str     r2, [r3]
        bx      lr
.L7:
        .align  2
.L6:
        .word   67112960

and for the last example

func:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        mov     r3, #1073741824
        mov     r2, r3
        ldr     r1, [r3]
.L2:
        ldr     r3, [r2]
        subs    r3, r3, r1
        uxth    r3, r3
        cmp     r3, r0
        bcc     .L2
        bx      lr
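
The compiler's assembly output prints immediates in decimal, whereas the
earlier objdump listings annotate them in hex. A small sketch tying the two
together (the macro names are hypothetical; the values are copied from the
listings in this thread):

```c
/* Decimal immediates as printed in the trunk assembly output. */
#define LITERAL_POOL_WORD  67112960u     /* .word 67112960          */
#define ORR_FIRST          65536u        /* orr r2, r2, #65536      */
#define ORR_LAST           16777216u     /* orr r2, r2, #16777216   */
#define MOV_BASE           1073741824u   /* mov r3, #1073741824     */

/* Compile-time checks against the hex constants used in func.c. */
_Static_assert(LITERAL_POOL_WORD == 0x4001000u, "PERIPH_BASE, first example");
_Static_assert(ORR_FIRST == 0x00010000u, "first REG1 bit mask");
_Static_assert(ORR_LAST == 0x01000000u, "last REG1 bit mask");
_Static_assert(MOV_BASE == 0x40000000u, "PERIPH_BASE, second example");
```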


May well be a dup of PR65768 ?



* [Bug rtl-optimization/65371] arm loop with volatile variable
  2015-03-10  0:49 [Bug target/65371] New: arm loop with volatile variable gcc-bugzilla at enginuities dot com
  2015-03-13  9:48 ` [Bug rtl-optimization/65371] " gcc-bugzilla at enginuities dot com
  2015-06-26  8:08 ` ramana at gcc dot gnu.org
@ 2015-06-26  9:55 ` gcc-bugzilla at enginuities dot com
  2 siblings, 0 replies; 4+ messages in thread
From: gcc-bugzilla at enginuities dot com @ 2015-06-26  9:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65371

--- Comment #4 from Stuart <gcc-bugzilla at enginuities dot com> ---
The assembly generated for the example from Comment #1 looks good.

However, the assembly for the last example (the second listing in Comment #3)
hasn't improved: it still contains the unnecessary mov instruction on line 2
(mov r2, r3).

The first instruction moves the address into r3 and the second copies r3 into
r2. The instruction at label .L2 then loads data into r3 from the address in
r2, overwriting r3. Had the address been placed in r2 to begin with, the copy
on line 2 would be unnecessary.

I would have expected to see:

func:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        mov     r2, #1073741824
        ldr     r1, [r2]
.L2:
        ldr     r3, [r2]
        subs    r3, r3, r1
        uxth    r3, r3
        cmp     r3, r0
        bcc     .L2
        bx      lr


