public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7
@ 2012-02-28 10:40 jay.foad at gmail dot com
  2012-02-28 11:51 ` [Bug target/52415] " rguenth at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: jay.foad at gmail dot com @ 2012-02-28 10:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

             Bug #: 52415
           Summary: memcpy to local variable generates unnecessary stack
                    frame for armv7
    Classification: Unclassified
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jay.foad@gmail.com


This test case uses memcpy to load the bytes of a float into an int:

$ cat m.c
int f(float *p) {
    int i;
    __builtin_memcpy(&i, p, sizeof i);
    return i;
}

When compiled for armv7 I get:

$ cc1 -o - -O3 m.c -quiet -march=armv7 -mthumb
...
f:
    @ args = 0, pretend = 0, frame = 4
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr    r0, [r0, #0]    @ unaligned
    sub    sp, sp, #4
    str    r0, [sp, #0]    @ unaligned
    ldr    r0, [sp, #0]
    add    sp, sp, #4
    bx    lr
...

The stack frame is unnecessary; it could compile to just:

    ldr    r0, [r0, #0]
    bx    lr

I'm using trunk rev 184597, configured with --target=arm-elf.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
@ 2012-02-28 11:51 ` rguenth at gcc dot gnu.org
  2012-02-28 12:02 ` jay.foad at gmail dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-02-28 11:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-02-28
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-02-28 11:45:04 UTC ---
This is I believe because ARM is a STRICT_ALIGNMENT target and we thus do
not transform the memcpy on the tree level.  On the tree level nothing
guarantees that 'p' is properly aligned.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
  2012-02-28 11:51 ` [Bug target/52415] " rguenth at gcc dot gnu.org
@ 2012-02-28 12:02 ` jay.foad at gmail dot com
  2012-02-28 12:17 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: jay.foad at gmail dot com @ 2012-02-28 12:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

--- Comment #2 from Jay Foad <jay.foad at gmail dot com> 2012-02-28 11:51:00 UTC ---
> On the tree level nothing guarantees that 'p' is properly aligned.

This is a digression, but what about C99 (Committee Draft -- April 12, 2011)
6.3.2.3p7:

"A pointer to an object type may be converted to a pointer to a different
object type. If the resulting pointer is not correctly aligned for the
referenced type, the behavior is undefined."

Doesn't that guarantee that p is properly aligned? If not, how can I assert
that p *is* properly aligned, so the compiler can turn the memcpy into an
aligned load? Thanks.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
  2012-02-28 11:51 ` [Bug target/52415] " rguenth at gcc dot gnu.org
  2012-02-28 12:02 ` jay.foad at gmail dot com
@ 2012-02-28 12:17 ` rguenth at gcc dot gnu.org
  2012-02-28 12:39 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-02-28 12:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-02-28 12:15:53 UTC ---
(In reply to comment #2)
> > On the tree level nothing guarantees that 'p' is properly aligned.
> 
> This is a digression, but what about C99 (Committee Draft -- April 12, 2011)
> 6.3.2.3p7:
> 
> "A pointer to an object type may be converted to a pointer to a different
> object type. If the resulting pointer is not correctly aligned for the
> referenced type, the behavior is undefined."
> 
> Doesn't that guarantee that p is properly aligned? If not, how can I assert
> that p *is* properly aligned, so the compiler can turn the memcpy into an
> aligned load? Thanks.

Well, not exactly.  The GIMPLE IL does not map 1:1 to the C99 spec (after
all it has to support other languages besides C).  There is no convenient
way for a frontend to tell the middle-end that 'p' is properly aligned
for a float.  Note that the situation is complicated by the fact that,
as GCC extension, you can create a misaligned variant

typedef float my_float __attribute__((aligned(1)));

int f(my_float *p) {
  int i;
  __builtin_memcpy(&i, p, sizeof i);
  return i;
}

which needs to be handled correctly as well (we don't on STRICT_ALIGNMENT
targets).  The my_float * case is of course not covered by C99.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
                   ` (2 preceding siblings ...)
  2012-02-28 12:17 ` rguenth at gcc dot gnu.org
@ 2012-02-28 12:39 ` jakub at gcc dot gnu.org
  2012-02-28 13:07 ` jay.foad at gmail dot com
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-02-28 12:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-02-28 12:29:14 UTC ---
But to answer your question, how you can assert it is properly aligned, in gcc
4.7.0 you can write:
  __builtin_memcpy (&i, __builtin_assume_aligned (p, sizeof *p), sizeof i);
which will assert that p is sizeof (float) bytes aligned and then you get code
without the stack frame.  Or you can use a union instead of memcpy:
  union U { float f; int i; } u;
  u.f = *p;
  return u.i;


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
                   ` (3 preceding siblings ...)
  2012-02-28 12:39 ` jakub at gcc dot gnu.org
@ 2012-02-28 13:07 ` jay.foad at gmail dot com
  2012-04-05  8:13 ` mhlavink at redhat dot com
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: jay.foad at gmail dot com @ 2012-02-28 13:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

--- Comment #5 from Jay Foad <jay.foad at gmail dot com> 2012-02-28 13:03:50 UTC ---
> But to answer your question, how you can assert it is properly aligned, in gcc
> 4.7.0 you can write:
>   __builtin_memcpy (&i, __builtin_assume_aligned (p, sizeof *p), sizeof i);

Thanks! Better:
  __builtin_assume_aligned (p, __alignof__ *p)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
                   ` (4 preceding siblings ...)
  2012-02-28 13:07 ` jay.foad at gmail dot com
@ 2012-04-05  8:13 ` mhlavink at redhat dot com
  2012-04-17  8:30 ` gjl at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: mhlavink at redhat dot com @ 2012-04-05  8:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

--- Comment #6 from Michal Hlavinka <mhlavink at redhat dot com> 2012-04-05 08:12:54 UTC ---
We can see this bug on avr target too.

Following code does not work:

DirEnt            tmp = eeFs.files[i_fileId1];
eeFs.files[i_fileId1] = eeFs.files[i_fileId2];
eeFs.files[i_fileId2] = tmp;

workaround from comment #4 helps 

DirEnt tmp;
__builtin_memcpy(&tmp, __builtin_assume_aligned(&eeFs.files[i_fileId1],
sizeof(DirEnt)), sizeof(DirEnt));
...
...

So it seems it's the same bug


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
                   ` (5 preceding siblings ...)
  2012-04-05  8:13 ` mhlavink at redhat dot com
@ 2012-04-17  8:30 ` gjl at gcc dot gnu.org
  2012-04-18 12:48 ` mhlavink at redhat dot com
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-04-17  8:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gjl at gcc dot gnu.org

--- Comment #7 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-04-17 08:28:19 UTC ---
(In reply to comment #6)
> We can see this bug on avr target too.
> 
> Following code does not work:
> 
> DirEnt            tmp = eeFs.files[i_fileId1];
> eeFs.files[i_fileId1] = eeFs.files[i_fileId2];
> eeFs.files[i_fileId2] = tmp;

Would you please post a complete test case to reproduce?
See http://gcc.gnu.org/bugs/#need


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
                   ` (6 preceding siblings ...)
  2012-04-17  8:30 ` gjl at gcc dot gnu.org
@ 2012-04-18 12:48 ` mhlavink at redhat dot com
  2012-04-18 20:02 ` gjl at gcc dot gnu.org
  2012-06-06 11:00 ` rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: mhlavink at redhat dot com @ 2012-04-18 12:48 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

--- Comment #8 from Michal Hlavinka <mhlavink at redhat dot com> 2012-04-18 12:22:34 UTC ---
Created attachment 27182
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27182
pre-processed reproducer (avr)

(In reply to comment #7)
> Would you please post a complete test case to reproduce?
> See http://gcc.gnu.org/bugs/#need

$ avr-gcc --version
avr-gcc (Fedora 4.7-0.fc17.1.20120302) 4.7.1 20120414 (prerelease)
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

(ignore old date in package name)

$ uname -ra
Linux nbone.mihlnet 3.3.0-8.fc16.x86_64 #1 SMP Thu Mar 29 18:37:19 UTC 2012
x86_64 x86_64 x86_64 GNU/Linux

command for compilation:
avr-g++ -mmcu=atmega2560  -gdwarf-2 -DF_CPU=16000000UL -O1 -Wall
-Wno-unused-variable -fno-inline-small-functions -fwhole-program testitb.cpp
--output test0.S  -S

In reproducer there are 3 options for structure swapping. First two reproduce
the problem. The last one works. This problem does not exist in avr-gcc 4.6.2

Original code:
DirEnt            tmp = eeFs.files[i_fileId1];
eeFs.files[i_fileId1] = eeFs.files[i_fileId2];
eeFs.files[i_fileId2] = tmp;
s_sync_write = true;
EeFsFlushDirEnt(i_fileId1);
EeFsFlushDirEnt(i_fileId2);

What happens:
it copies structures and then it calls EeFsFlushDirEnt but with corrupted
values.
.L__stack_usage = 4
    mov r28,r22
.LBB2:
    .loc 1 50 0
    mov r18,r24
    ldi r19,0
    movw r26,r18
    lsl r26
    rol r27
    add r26,r18
    adc r27,r19
    subi r26,lo8(-(eeFs))
    sbci r27,hi8(-(eeFs))
    adiw r26,4
    ld r24,X+
    ld r25,X+
    ld r26,X
    sbiw r26,4+2
.LVL2:
    .loc 1 51 0
    mov r30,r22
    ldi r31,0
    movw r12,r30
    lsl r12
    rol r13
    add r30,r12
    adc r31,r13
    subi r30,lo8(-(eeFs))
    sbci r31,hi8(-(eeFs))
    ldd r12,Z+4
    ldd r13,Z+5
    ldd r14,Z+6
    adiw r26,4
    st X+,r12
    st X+,r13
    st X,r14
    .loc 1 52 0
    std Z+4,r18
    std Z+5,r19
    std Z+6,r20
.LVL3:
    .loc 1 60 0
    call _ZL15EeFsFlushDirEnth
.LVL4:
    .loc 1 61 0
    mov r24,r28
    call _ZL15EeFsFlushDirEnth

First call is made with r24 that got overwritten during structure copy.
In the third structure copy code (the working one) r24 is not changed.

I'm not too good in assembler interpretation, so this is what it seems to me is
happening, but I can be wrong and there can be different problem than r24.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
                   ` (7 preceding siblings ...)
  2012-04-18 12:48 ` mhlavink at redhat dot com
@ 2012-04-18 20:02 ` gjl at gcc dot gnu.org
  2012-06-06 11:00 ` rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-04-18 20:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

--- Comment #9 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-04-18 20:00:17 UTC ---
(In reply to comment #6)
> We can see this bug on avr target too.

Thanks for the test case.

It's a bug, but completely unrelated to this one. See PR53033.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/52415] memcpy to local variable generates unnecessary stack frame for armv7
  2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
                   ` (8 preceding siblings ...)
  2012-04-18 20:02 ` gjl at gcc dot gnu.org
@ 2012-06-06 11:00 ` rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-06-06 11:00 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52415

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE

--- Comment #10 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-06-06 11:00:31 UTC ---
This is a dup of PR50417

*** This bug has been marked as a duplicate of bug 50417 ***


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-06-06 11:00 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-28 10:40 [Bug target/52415] New: memcpy to local variable generates unnecessary stack frame for armv7 jay.foad at gmail dot com
2012-02-28 11:51 ` [Bug target/52415] " rguenth at gcc dot gnu.org
2012-02-28 12:02 ` jay.foad at gmail dot com
2012-02-28 12:17 ` rguenth at gcc dot gnu.org
2012-02-28 12:39 ` jakub at gcc dot gnu.org
2012-02-28 13:07 ` jay.foad at gmail dot com
2012-04-05  8:13 ` mhlavink at redhat dot com
2012-04-17  8:30 ` gjl at gcc dot gnu.org
2012-04-18 12:48 ` mhlavink at redhat dot com
2012-04-18 20:02 ` gjl at gcc dot gnu.org
2012-06-06 11:00 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).