public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
@ 2015-09-23 23:56 jwerner at chromium dot org
  2015-09-24  0:03 ` [Bug c/67701] " jwerner at chromium dot org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: jwerner at chromium dot org @ 2015-09-23 23:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

            Bug ID: 67701
           Summary: Unnecessary/bad instructions for u32-casted access to
                    external symbol (assumes misaligned, superfluous load)
           Product: gcc
           Version: 5.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jwerner at chromium dot org
  Target Milestone: ---

Consider the following sample program:

static inline void write32(void *addr, unsigned int value) {
        *(volatile unsigned int *)addr = value;
}

extern unsigned char testvalue[];
void main() {
        write32(&testvalue, 4);
}

GCC 5.2 (with -O2 -fno-stack-protector -ffreestanding -fomit-frame-pointer)
generates:

main:
        mov     r1, #4
        mov     r2, #0
        ldr     r3, .L2
        ldrb    r0, [r3]        @ zero_extendqisi2
        strb    r1, [r3]
        ldrb    r1, [r3, #1]    @ zero_extendqisi2
        strb    r2, [r3, #1]
        ldrb    r1, [r3, #2]    @ zero_extendqisi2
        strb    r2, [r3, #2]
        ldrb    r1, [r3, #3]    @ zero_extendqisi2
        strb    r2, [r3, #3]
        bx      lr

Whereas GCC 4.9 (same arguments) generates:

main:
        movw    r3, #:lower16:testvalue
        movs    r2, #4
        movt    r3, #:upper16:testvalue
        ldr     r1, [r3]
        str     r2, [r3]
        bx      lr

I think there are two problems with this:

1. Both GCC 5.2 and 4.9 generate superfluous load instructions to |addr|, even
though the pointer is never dereferenced as an rvalue (only as the left-hand
side of an assignment). This is a) unnecessary and b) dangerous since the
pointer is declared volatile (meaning it could be an MMIO register with read
side-effects).

2. GCC 5.2 seems to somehow assume it must treat the object as unaligned and
generates byte-wise accesses. I don't understand why it would do that, since
nothing here is declared __attribute__((packed)) and the explicit cast to a
4-byte-aligned type should remove any ambiguity for the compiler about which
alignment it can assume. It is important for systems code to have a (portable)
way to dereference an address with an explicit alignment regardless of where it
came from (e.g. for MMIO registers that only support 32-bit wide accesses).

Both of these issues can be solved by using __builtin_assume_aligned(addr,
sizeof(unsigned long)), but I still think the compiler shouldn't have done this
in the first place (and having a portable solution to this problem is
preferable).

Context: http://review.coreboot.org/#/c/11698


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c/67701] Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
  2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
@ 2015-09-24  0:03 ` jwerner at chromium dot org
  2015-09-24  0:28 ` [Bug target/67701] " pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jwerner at chromium dot org @ 2015-09-24  0:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

Julius Werner <jwerner at chromium dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |armv7a

--- Comment #1 from Julius Werner <jwerner at chromium dot org> ---
edit: target architecture in my example is armv7a, btw (although I doubt that
this is architecture-specific)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/67701] Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
  2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
  2015-09-24  0:03 ` [Bug c/67701] " jwerner at chromium dot org
@ 2015-09-24  0:28 ` pinskia at gcc dot gnu.org
  2015-09-24  1:37 ` jwerner at chromium dot org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-09-24  0:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |target

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Julius Werner from comment #1)
> edit: target architecture in my example is armv7a, btw (although I doubt
> that this is architecture-specific)

I suspect this is an armv7 issue only where there is missing support for
misaligned load/stores.

Also testvalue is aligned to 1 byte by default so what GCC is doing is correct
as there is no way to do a volatile unaligned loads really.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/67701] Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
  2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
  2015-09-24  0:03 ` [Bug c/67701] " jwerner at chromium dot org
  2015-09-24  0:28 ` [Bug target/67701] " pinskia at gcc dot gnu.org
@ 2015-09-24  1:37 ` jwerner at chromium dot org
  2015-09-24  7:48 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jwerner at chromium dot org @ 2015-09-24  1:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

--- Comment #3 from Julius Werner <jwerner at chromium dot org> ---
> I suspect this is an armv7 issue only where there is missing support for misaligned load/stores.

Did a quick test on aarch64 and you're right, neither of the two issues appears
there.

> Also testvalue is aligned to 1 byte by default so what GCC is doing is correct as there is no way to do a volatile unaligned loads really.

I thought alignment was always tied to a type (at least by default, unless
overridden by __attribute__((align)) or the like)? It's true that the expected
alignment for testvalue is 1, but I'm not dereferencing testvalue... I'm
dereferencing *(volatile unsigned long *)&testvalue. Shouldn't the alignment of
that always be 4? (FWIW, _Alignof(*(volatile unsigned long *)&testvalue)
evaluates to 4 in both compiler versions.)

Another argument: if write32() was defined as a non-inline function in a
separate execution unit, this problem would not appear. Should inlining a
function really change behavior in this way and produce code that is clearly
less efficient (to try to work around a "problem" that the -O0 version of the
same code would have never found)?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/67701] Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
  2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
                   ` (2 preceding siblings ...)
  2015-09-24  1:37 ` jwerner at chromium dot org
@ 2015-09-24  7:48 ` rguenth at gcc dot gnu.org
  2015-09-24 13:53 ` ebotcazou at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-09-24  7:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
As said in another bug GCC tries to be conservative (QOI) and fixes up
programmer mistakes in case it can see that the alignment of some memory is
clearly _not_
according to the language requirement (as here, dereferencing a int *).

In this case it sees the global 'unsigned char[]' and concludes its alignment
does not have to be larger than to 1 byte.

We could change that behavior with a flag if people like (-fstrict-alignment?).


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/67701] Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
  2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
                   ` (3 preceding siblings ...)
  2015-09-24  7:48 ` rguenth at gcc dot gnu.org
@ 2015-09-24 13:53 ` ebotcazou at gcc dot gnu.org
  2015-09-24 14:09 ` rguenther at suse dot de
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2015-09-24 13:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-09-24
                 CC|                            |ebotcazou at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #5 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
I wonder what the motivation to make this change was as, historically, GCC has
never tried to rescue the programmer in this clearly invalid case.  Some
obscure situation with SSE on x86?  Do other compilers do the same, e.g. on
ARM?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/67701] Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
  2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
                   ` (4 preceding siblings ...)
  2015-09-24 13:53 ` ebotcazou at gcc dot gnu.org
@ 2015-09-24 14:09 ` rguenther at suse dot de
  2015-09-24 15:14 ` ebotcazou at gcc dot gnu.org
  2015-09-25  7:38 ` rguenther at suse dot de
  7 siblings, 0 replies; 9+ messages in thread
From: rguenther at suse dot de @ 2015-09-24 14:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 24 Sep 2015, ebotcazou at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701
> 
> Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|UNCONFIRMED                 |NEW
>    Last reconfirmed|                            |2015-09-24
>                  CC|                            |ebotcazou at gcc dot gnu.org
>      Ever confirmed|0                           |1
> 
> --- Comment #5 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> I wonder what the motivation to make this change was as, historically, GCC has
> never tried to rescue the programmer in this clearly invalid case.  Some
> obscure situation with SSE on x86?  Do other compilers do the same, e.g. on
> ARM?

Yes, AFAIK this was some obscure situation with SSE on x86.  IIRC
code doing unaligned scalar accesses (which is ok on x86) but then
vectorized using peeling for alignment (which cannot succeed if the
element is not naturally aligned) and segfaulting for the emitted
aligned move instructions.

Of course the pessimistic assumption we make only applies when
the compiler sees the underlying decl, thus only a fraction of
all possible cases are handled conservative that way.

Maybe these days the legacy has been cleaned up enough so we can
remove that conservative handling again...  I think it also causes
us to handle

char c[4];

int main()
{
  if (!((unsigned long)c & 3))
    return *(int *)c;
  return c[0];
}

too conservatively as we expand

  _5 = MEM[(int *)&c];

and thus lost the flow-sensitive info.

Implementation-wise you'd have to adjust get_object_alignment_2 to
get at a MEM_REF base (get_inner_reference will look through MEM_REFs
with &decl operand 0).  Eventually by adjusting get_inner_reference
itself to avoid doing the work twice.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/67701] Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
  2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
                   ` (5 preceding siblings ...)
  2015-09-24 14:09 ` rguenther at suse dot de
@ 2015-09-24 15:14 ` ebotcazou at gcc dot gnu.org
  2015-09-25  7:38 ` rguenther at suse dot de
  7 siblings, 0 replies; 9+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2015-09-24 15:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

--- Comment #7 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> Yes, AFAIK this was some obscure situation with SSE on x86.  IIRC
> code doing unaligned scalar accesses (which is ok on x86) but then
> vectorized using peeling for alignment (which cannot succeed if the
> element is not naturally aligned) and segfaulting for the emitted
> aligned move instructions.

I see, thanks for the insight.

> Maybe these days the legacy has been cleaned up enough so we can
> remove that conservative handling again...  I think it also causes
> us to handle
> 
> char c[4];
> 
> int main()
> {
>   if (!((unsigned long)c & 3))
>     return *(int *)c;
>   return c[0];
> }
> 
> too conservatively as we expand
> 
>   _5 = MEM[(int *)&c];
> 
> and thus lost the flow-sensitive info.

The problem is that, in order to fix a legitimate issue on x86, the change
pessimizes the code for strict-alignment platforms, where the said issue
doesn't exist since there are unaligned accesses in the source code.  And of
course only for them, since x86 has unaligned load/stores.  So, in the end,
this is a net loss for strict-alignment platforms.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/67701] Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load)
  2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
                   ` (6 preceding siblings ...)
  2015-09-24 15:14 ` ebotcazou at gcc dot gnu.org
@ 2015-09-25  7:38 ` rguenther at suse dot de
  7 siblings, 0 replies; 9+ messages in thread
From: rguenther at suse dot de @ 2015-09-25  7:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 24 Sep 2015, ebotcazou at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67701
> 
> --- Comment #7 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> > Yes, AFAIK this was some obscure situation with SSE on x86.  IIRC
> > code doing unaligned scalar accesses (which is ok on x86) but then
> > vectorized using peeling for alignment (which cannot succeed if the
> > element is not naturally aligned) and segfaulting for the emitted
> > aligned move instructions.
> 
> I see, thanks for the insight.
> 
> > Maybe these days the legacy has been cleaned up enough so we can
> > remove that conservative handling again...  I think it also causes
> > us to handle
> > 
> > char c[4];
> > 
> > int main()
> > {
> >   if (!((unsigned long)c & 3))
> >     return *(int *)c;
> >   return c[0];
> > }
> > 
> > too conservatively as we expand
> > 
> >   _5 = MEM[(int *)&c];
> > 
> > and thus lost the flow-sensitive info.
> 
> The problem is that, in order to fix a legitimate issue on x86, the change
> pessimizes the code for strict-alignment platforms, where the said issue
> doesn't exist since there are unaligned accesses in the source code.  And of
> course only for them, since x86 has unaligned load/stores.  So, in the end,
> this is a net loss for strict-alignment platforms.

Agreed.  Looking at how to fix this in get_object_alignment_2 I wonder
if it makes sense to unify this function with get_inner_reference.  The
other choice would be to add a flag to get_inner_reference to stop
at MEM_REF/TARGET_MEM_REFs.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-09-25  7:38 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-23 23:56 [Bug c/67701] New: Unnecessary/bad instructions for u32-casted access to external symbol (assumes misaligned, superfluous load) jwerner at chromium dot org
2015-09-24  0:03 ` [Bug c/67701] " jwerner at chromium dot org
2015-09-24  0:28 ` [Bug target/67701] " pinskia at gcc dot gnu.org
2015-09-24  1:37 ` jwerner at chromium dot org
2015-09-24  7:48 ` rguenth at gcc dot gnu.org
2015-09-24 13:53 ` ebotcazou at gcc dot gnu.org
2015-09-24 14:09 ` rguenther at suse dot de
2015-09-24 15:14 ` ebotcazou at gcc dot gnu.org
2015-09-25  7:38 ` rguenther at suse dot de

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).