public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above
@ 2021-09-01 16:52 danglin at gcc dot gnu.org
  2021-09-01 18:52 ` [Bug tree-optimization/102162] " danglin at gcc dot gnu.org
                   ` (32 more replies)
  0 siblings, 33 replies; 34+ messages in thread
From: danglin at gcc dot gnu.org @ 2021-09-01 16:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

            Bug ID: 102162
           Summary: Byte-wise access optimized away at -O1 and above
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: danglin at gcc dot gnu.org
                CC: helge.deller at sap dot com
  Target Milestone: ---
              Host: hppa*-*-linux*
            Target: hppa*-*-linux*
             Build: hppa*-*-linux*

Created attachment 51394
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51394&action=edit
Test case

The packed attribute is used in Linux v5.14 to request byte-wise access
to unaligned data.  This is important on hppa as loads and stores require
strict alignment.

The attached test program is miscompiled at -O1 and above.  The byte-wise
accesses are optimized to a single ldw instruction during RTL expansion:

        .LEVEL 2.0w
        .text
        .align 8
.globl test
        .type   test, @function
test:
        .PROC
        .CALLINFO FRAME=0,NO_CALLS
        .ENTRY
        addil LT'output_len,%r27
        ldd RT'output_len(%r1),%r28
        ldw 0(%r28),%r28
        bve (%r2)
        extrd,s %r28,63,32,%r28
        .EXIT
        .PROCEND
        .size   test, .-test
.globl output_len
        .section        .bss
        .type   output_len, @object
        .size   output_len, 4
        .align 1
output_len:
        .block 4
        .ident  "GCC: (GNU) 10.3.0"

This faults when output_len is not aligned on a word boundary.

I'm not sure, but the problem may be in the einline pass (test-unaligned.c.027t.einline):

;; Function get_unaligned_le32 (get_unaligned_le32, funcdef_no=0, decl_uid=1506, cgraph_uid=1, symbol_order=1)

Iterations: 0
get_unaligned_le32 (const void * p)
{
  const struct
  {
    u32 x;
  } * __pptr;
  u32 _4;

  <bb 2> :
  __pptr_2 = p_1(D);
  _4 = __pptr_2->x;
  return _4;

}



;; Function test (test, funcdef_no=1, decl_uid=1512, cgraph_uid=2, symbol_order=2)

Iterations: 1

Symbols to be put in SSA form
{ D.1520 D.1524 }
Incremental SSA update started at block: 0
Number of blocks in CFG: 5
Number of blocks to update: 4 ( 80%)


Merging blocks 2 and 4
Merging blocks 2 and 3
test ()
{
  u32 D.1524;
  unsigned int _1;
  unsigned int _3;
  int _4;

  <bb 2> :
  _3 = MEM[(const struct  *)&output_len].x;
  _5 = _3;
  _1 = _5;
  _4 = (int) _1;
  return _4;

}

Ultimately, the MEM gets expanded to the ldw.
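
For readers without the attachment, here is a minimal sketch of the kind of
accessor under discussion. It is reconstructed from the einline dump above and
is not the attached test case; the names u32_packed, load_u32_packed and
load_u32_memcpy are hypothetical, and the memcpy variant is included only as a
reference that makes no alignment assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t u32;

/* Packed wrapper, mirroring the anonymous struct { u32 x; } in the dump. */
struct u32_packed {
        u32 x;
} __attribute__((packed));

/* Packing must not change the size of the wrapped value. */
static_assert(sizeof(struct u32_packed) == sizeof(u32),
              "packed wrapper has the same size as u32");

/* Load through the packed wrapper; the compiler may not assume alignment. */
static inline u32 load_u32_packed(const void *p)
{
        const struct u32_packed *pptr = p;

        return pptr->x;
}

/* Reference implementation: memcpy works for any address. */
static inline u32 load_u32_memcpy(const void *p)
{
        u32 v;

        memcpy(&v, p, sizeof(v));
        return v;
}
```

Both helpers should return the same value for any address, aligned or not.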

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
@ 2021-09-01 18:52 ` danglin at gcc dot gnu.org
  2021-09-01 20:14 ` arnd at linaro dot org
                   ` (31 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: danglin at gcc dot gnu.org @ 2021-09-01 18:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #1 from John David Anglin <danglin at gcc dot gnu.org> ---
Created attachment 51395
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51395&action=edit
Second test case

Changing the optimization of get_unaligned_le32 to 0 results in correct
code generation.  We have the following in test-unaligned.c.235t.optimized:

ave@atlas:~/linux/misc$ cat test-unaligned.c.235t.optimized

;; Function get_unaligned_le32 (get_unaligned_le32, funcdef_no=0, decl_uid=1506, cgraph_uid=1, symbol_order=1)

__attribute__((optimize (0)))
get_unaligned_le32 (const void * p)
{
  const struct
  {
    u32 x;
  } * __pptr;
  u32 D.1517;
  u32 _4;

  <bb 2> :
  __pptr_2 = p_1(D);
  _4 = __pptr_2->x;

  <bb 3> :
<L0>:
  return _4;

}



;; Function test (test, funcdef_no=1, decl_uid=1512, cgraph_uid=2, symbol_order=2)

test ()
{
  unsigned int _1;
  int _4;

  <bb 2> [local count: 1073741824]:
  _1 = get_unaligned_le32 (&output_len); [tail call]
  _4 = (int) _1;
  return _4;

}


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
  2021-09-01 18:52 ` [Bug tree-optimization/102162] " danglin at gcc dot gnu.org
@ 2021-09-01 20:14 ` arnd at linaro dot org
  2021-09-01 20:52 ` deller at gmx dot de
                   ` (30 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: arnd at linaro dot org @ 2021-09-01 20:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

Arnd Bergmann <arnd at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |arnd at linaro dot org

--- Comment #2 from Arnd Bergmann <arnd at linaro dot org> ---
I tried reproducing the issue with my original kernel code, using this input:

typedef unsigned u32;
#define __packed __attribute__((packed))

#define __get_unaligned_t(type, ptr) ({                                        \
        const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);     \
        __pptr->x;                                                             \
})

#define get_unaligned(ptr)      __get_unaligned_t(typeof(*(ptr)), (ptr))

int f_unaligned(u32 *p)
{ 
     return get_unaligned(p); 
}

int g(u32 *p) 
{ 
     return *(p); 
}

and it looks like I get correct output:

hppa64-linux-gcc -S kernel/test_unaligned.c -o - -O2
        .LEVEL 2.0w
        .text
        .align 8
.globl f_unaligned
        .type   f_unaligned, @function
f_unaligned:
        .PROC
        .CALLINFO FRAME=0,NO_CALLS
        .ENTRY
        ldb 0(%r26),%r20
        ldb 1(%r26),%r19
        depd,z %r20,39,40,%r20
        depd,z %r19,47,48,%r19
        ldb 2(%r26),%r31
        ldb 3(%r26),%r28
        or %r19,%r20,%r19
        depd,z %r31,55,56,%r31
        or %r31,%r19,%r31
        or %r28,%r31,%r28
        bve (%r2)
        extrd,s %r28,63,32,%r28
        .EXIT
        .PROCEND
        .size   f_unaligned, .-f_unaligned
        .align 8
.globl g
        .type   g, @function
g:
        .PROC
        .CALLINFO FRAME=0,NO_CALLS
        .ENTRY
        ldw 0(%r26),%r28
        bve (%r2)
        extrd,s %r28,63,32,%r28
        .EXIT
        .PROCEND
        .size   g, .-g
        .ident  "GCC: (GNU) 11.1.0"

Any idea what the difference is between the working version and your broken
one?


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
  2021-09-01 18:52 ` [Bug tree-optimization/102162] " danglin at gcc dot gnu.org
  2021-09-01 20:14 ` arnd at linaro dot org
@ 2021-09-01 20:52 ` deller at gmx dot de
  2021-09-01 20:56 ` dave.anglin at bell dot net
                   ` (29 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: deller at gmx dot de @ 2021-09-01 20:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #3 from deller at gmx dot de ---
Hi Arnd,

I think the problem with your testcase is that the compiler doesn't know the
alignment of the parameter "p" in your f_unaligned() function,
so it generates byte accesses.

If you add the following to your testcase and compile with -O1 (or higher),
you will see the problem:

int evil;
int f_unaligned2(void)
{
     return get_unaligned(&evil);
}

00000000 <f_unaligned2>:
   0:   2b 60 00 00     addil L%0,dp,r1
   4:   34 21 00 00     ldo 0(r1),r1
   8:   e8 40 c0 00     bv r0(rp)
   c:   0c 20 10 9c     ldw 0(r1),ret0


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-09-01 20:52 ` deller at gmx dot de
@ 2021-09-01 20:56 ` dave.anglin at bell dot net
  2021-09-01 21:08 ` dave.anglin at bell dot net
                   ` (28 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: dave.anglin at bell dot net @ 2021-09-01 20:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #4 from dave.anglin at bell dot net ---
On 2021-09-01 4:14 p.m., arnd at linaro dot org wrote:
> Any idea what the difference is between the working version and your broken
> one?
Not really.  My original test case worked as well.  Helge created the broken
one.


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-09-01 20:56 ` dave.anglin at bell dot net
@ 2021-09-01 21:08 ` dave.anglin at bell dot net
  2021-09-01 21:15 ` deller at gmx dot de
                   ` (27 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: dave.anglin at bell dot net @ 2021-09-01 21:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #5 from dave.anglin at bell dot net ---
On 2021-09-01 4:52 p.m., deller at gmx dot de wrote:
> I think the problem with your testcase is, that the compiler doesn't know the 
> alignment of the parameter "p" in your f_unaligned() function.
> So it will generate byte-accesses.
So, it seems the __aligned__ attribute is ignored:
extern u32 output_len __attribute__((__aligned__(1)));


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-09-01 21:08 ` dave.anglin at bell dot net
@ 2021-09-01 21:15 ` deller at gmx dot de
  2021-09-01 21:19 ` dave.anglin at bell dot net
                   ` (26 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: deller at gmx dot de @ 2021-09-01 21:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #6 from deller at gmx dot de ---
> So, it seems the __aligned__ attribute is ignored:
> extern u32 output_len __attribute__((__aligned__(1)));

I think the aligned attribute is not relevant here. Even
        u32 output_len;
will generate word-accesses.
I'd say that the packed attribute is ignored
when the compiler knows that the source is aligned.
The "__attribute__((__packed__))" should *always* trigger byte-accesses.
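
To make "byte-accesses" concrete: on a big-endian target such as hppa, the
sequence that the packed access is expected to expand to corresponds roughly
to the C below. This is only an illustration; load_be32_bytewise is a
hypothetical helper, not something taken from the kernel or the test case.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The shifts below assume ordinary 8-bit bytes. */
static_assert(CHAR_BIT == 8, "byte-wise assembly assumes 8-bit bytes");

/* Build a 32-bit big-endian value from four single-byte loads. */
static inline uint32_t load_be32_bytewise(const unsigned char *p)
{
        return ((uint32_t)p[0] << 24) |
               ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |
                (uint32_t)p[3];
}
```

Because every access is a single byte, no address passed to this helper can
cause an alignment fault.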


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-09-01 21:15 ` deller at gmx dot de
@ 2021-09-01 21:19 ` dave.anglin at bell dot net
  2021-09-01 21:25 ` deller at gmx dot de
                   ` (25 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: dave.anglin at bell dot net @ 2021-09-01 21:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #7 from dave.anglin at bell dot net ---
On 2021-09-01 4:52 p.m., deller at gmx dot de wrote:
> I think the problem with your testcase is, that the compiler doesn't know the 
> alignment of the parameter "p" in your f_unaligned() function.
> So it will generate byte-accesses.
I think it's the type rather than the alignment.  If type is char, one gets
byte accesses.  If type is short, one gets 16-bit accesses.

The alignment is being ignored.
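
The distinction between type and alignment can be checked at compile time.
In this sketch (the struct names are hypothetical), the wrapped type
determines the size of the access, while the packed attribute is what lowers
the required alignment to 1:

```c
#include <assert.h>
#include <stdint.h>

/* Packed wrappers around 8-, 16- and 32-bit members. */
struct packed_u8  { uint8_t  x; } __attribute__((packed));
struct packed_u16 { uint16_t x; } __attribute__((packed));
struct packed_u32 { uint32_t x; } __attribute__((packed));

/* The member type sets the access width ... */
static_assert(sizeof(struct packed_u8)  == 1, "8-bit member, 1 byte");
static_assert(sizeof(struct packed_u16) == 2, "16-bit member, 2 bytes");
static_assert(sizeof(struct packed_u32) == 4, "32-bit member, 4 bytes");

/* ... while packed sets the alignment to one byte in every case. */
static_assert(_Alignof(struct packed_u8)  == 1, "alignment 1");
static_assert(_Alignof(struct packed_u16) == 1, "alignment 1");
static_assert(_Alignof(struct packed_u32) == 1, "alignment 1");
```

A correct expansion on a strict-alignment target has to honour the second
group of assertions, not just the first.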


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-09-01 21:19 ` dave.anglin at bell dot net
@ 2021-09-01 21:25 ` deller at gmx dot de
  2021-09-01 21:29 ` deller at gmx dot de
                   ` (24 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: deller at gmx dot de @ 2021-09-01 21:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #8 from deller at gmx dot de ---
On 9/1/21 11:19 PM, dave.anglin at bell dot net wrote:
>> I think the problem with your testcase is, that the compiler doesn't know the
>> alignment of the parameter "p" in your f_unaligned() function.
>> So it will generate byte-accesses.
> I think it's the type rather than the alignment.  If type is char, one gets
> byte accesses.  If type is short, one gets 16-bit accesses.
>
> The alignment is being ignored.

You are right.
It's even worse!

short evil;
int f_unaligned2(void)
{ return get_unaligned(&evil); }

gives:
00000000 <f_unaligned2>:
    0:   2b 60 00 00     addil L%0,dp,r1
    4:   44 3c 00 00     ldh 0(r1),ret0
    8:   e8 40 c0 00     bv r0(rp)
    c:   d3 9c 1f f0     extrw,s ret0,31,16,ret0

The "ldh" loads only the first two bytes, and extends it into the upper 32bits
with "extrw,s".
So, only 16bits instead of 32bits are loaded from the address where "evil"
is...


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2021-09-01 21:25 ` deller at gmx dot de
@ 2021-09-01 21:29 ` deller at gmx dot de
  2021-09-01 21:48 ` pinskia at gcc dot gnu.org
                   ` (23 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: deller at gmx dot de @ 2021-09-01 21:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #9 from deller at gmx dot de ---
On 9/1/21 11:25 PM, deller at gmx dot de wrote:
> The "ldh" loads only the first two bytes, and extends it into the upper 32bits
> with "extrw,s".
> So, only 16bits instead of 32bits are loaded from the address where "evil" is...

Forget this!
My testcase was wrong. Here is the correct testcase which then loads 32bits:

short evil;
int f_unaligned2(void)
{ return get_unaligned((unsigned long *)&evil); }

00000000 <f_unaligned2>:
    0:   2b 60 00 00     addil L%0,dp,r1
    4:   34 33 00 00     ldo 0(r1),r19
    8:   44 3c 00 00     ldh 0(r1),ret0
    c:   d7 9c 0a 10     depw,z ret0,15,16,ret0
   10:   0e 64 10 53     ldh 2(r19),r19
   14:   e8 40 c0 00     bv r0(rp)
   18:   0b 93 02 5c     or r19,ret0,ret0
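
The width of the load follows the pointee type handed to get_unaligned, which
is why the cast changes the result. A small sketch of that point, with the
macros renamed to GET_UNALIGNED_T and GET_UNALIGNED (hypothetical names, to
avoid reserved identifiers) and uint32_t standing in for unsigned long, which
is assumed to be 32 bits on this target. It relies on the GNU C statement
expression and __typeof__ extensions, and the loads sit inside sizeof, so
nothing is actually read:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GET_UNALIGNED_T(type, ptr) ({                                   \
        const struct { type x; } __attribute__((packed)) *pptr_ =       \
                (__typeof__(pptr_))(ptr);                               \
        pptr_->x;                                                       \
})

#define GET_UNALIGNED(ptr) GET_UNALIGNED_T(__typeof__(*(ptr)), (ptr))

/* Width of the value produced when a pointer to short is passed. */
static inline size_t width_for_short_ptr(void)
{
        short evil = 0;

        return sizeof(GET_UNALIGNED(&evil));    /* operand is not evaluated */
}

/* Width when the same address is cast to a pointer to a 32-bit type. */
static inline size_t width_for_u32_ptr(void)
{
        short evil = 0;

        return sizeof(GET_UNALIGNED((uint32_t *)&evil)); /* not evaluated */
}

/* Runtime check of both widths. */
static inline void check_widths(void)
{
        assert(width_for_short_ptr() == sizeof(short));
        assert(width_for_u32_ptr() == sizeof(uint32_t));
}
```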


* [Bug tree-optimization/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2021-09-01 21:29 ` deller at gmx dot de
@ 2021-09-01 21:48 ` pinskia at gcc dot gnu.org
  2021-09-01 21:51 ` [Bug middle-end/102162] " pinskia at gcc dot gnu.org
                   ` (22 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 21:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So I looked into this a little bit, and it works on aarch64 with -O1
-mstrict-align, but if you remove -mstrict-align we get an unaligned access,
which I think is expected.
The gimple level is the same in both cases; it is expand that changes.

Does hppa*-*-linux* have STRICT_ALIGNMENT set to true or false?


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2021-09-01 21:48 ` pinskia at gcc dot gnu.org
@ 2021-09-01 21:51 ` pinskia at gcc dot gnu.org
  2021-09-01 21:52 ` pinskia at gcc dot gnu.org
                   ` (21 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 21:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |middle-end
           Keywords|                            |wrong-code

--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #10) 
> Does hppa*-*-linux* have STRICT_ALIGNMENT set to true or false?

config/pa/pa.h:#define STRICT_ALIGNMENT 1

Hmm, so it should work.
It is definitely something in the expansion between gimple and rtl which is
messing up.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2021-09-01 21:51 ` [Bug middle-end/102162] " pinskia at gcc dot gnu.org
@ 2021-09-01 21:52 ` pinskia at gcc dot gnu.org
  2021-09-01 22:35 ` dave.anglin at bell dot net
                   ` (20 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 21:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is what the first testcase looks like at -O1 -mstrict-align on
aarch64-linux-gnu for GCC 10.3.0:
test:
.LFB1:
        .cfi_startproc
        adrp    x0, output_len
        add     x1, x0, :lo12:output_len
        ldrb    w2, [x0, #:lo12:output_len]
        ldrb    w0, [x1, 1]
        orr     x2, x2, x0, lsl 8
        ldrb    w0, [x1, 2]
        orr     x0, x2, x0, lsl 16
        ldrb    w1, [x1, 3]
        orr     w0, w0, w1, lsl 24
        ret
        .cfi_endproc
.LFE1:
        .size   test, .-test
        .ident  "GCC: (GNU) 10.3.0"
        .section        .note.GNU-stack,"",@progbits

This is doing the correct thing in splitting up the load into byte loads.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2021-09-01 21:52 ` pinskia at gcc dot gnu.org
@ 2021-09-01 22:35 ` dave.anglin at bell dot net
  2021-09-01 22:46 ` dave.anglin at bell dot net
                   ` (19 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: dave.anglin at bell dot net @ 2021-09-01 22:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #13 from dave.anglin at bell dot net ---
On 2021-09-01 5:52 p.m., pinskia at gcc dot gnu.org wrote:
> This is doing the correct thing in splitting up the load into byte loads.
We only get correct code at -O0.  STRICT_ALIGNMENT is defined to 1.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2021-09-01 22:35 ` dave.anglin at bell dot net
@ 2021-09-01 22:46 ` dave.anglin at bell dot net
  2021-09-01 22:56 ` pinskia at gcc dot gnu.org
                   ` (18 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: dave.anglin at bell dot net @ 2021-09-01 22:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #14 from dave.anglin at bell dot net ---
On 2021-09-01 6:35 p.m., dave.anglin at bell dot net wrote:
> We only get correct code at -O0.
Maybe the cpymemsi expander is the problem.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2021-09-01 22:46 ` dave.anglin at bell dot net
@ 2021-09-01 22:56 ` pinskia at gcc dot gnu.org
  2021-09-01 23:19 ` pinskia at gcc dot gnu.org
                   ` (17 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 22:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The trunk works:
        .LEVEL 1.1
        .text
        .align 4
.globl test
        .type   test, @function
test:
        .PROC
        .CALLINFO FRAME=0,NO_CALLS
        .ENTRY
        addil LR'output_len-$global$,%r27
        ldo RR'output_len-$global$(%r1),%r20
        ldb RR'output_len-$global$(%r1),%r28
        zdep %r28,7,8,%r28
        ldb 1(%r20),%r19
        zdep %r19,15,16,%r19
        or %r19,%r28,%r19
        ldb 2(%r20),%r28
        zdep %r28,23,24,%r28
        or %r28,%r19,%r28
        ldb 3(%r20),%r19
        bv %r0(%r2)
        or %r19,%r28,%r28
        .EXIT
        .PROCEND
        .size   test, .-test
        .ident  "GCC: (GNU) 12.0.0 20210901 (experimental)"


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2021-09-01 22:56 ` pinskia at gcc dot gnu.org
@ 2021-09-01 23:19 ` pinskia at gcc dot gnu.org
  2021-09-01 23:21 ` pinskia at gcc dot gnu.org
                   ` (16 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 23:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #16 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I cannot even reproduce the original issue on released gcc 10.3.0 sources.
What configure options are being used? I used none except for --target:
Configured with: ../configure --target=hppa-linux-gnu

I even tried with  -march=2.0 and it still works.
Looks like the target really is hppa*64-linux-gnu :)


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2021-09-01 23:19 ` pinskia at gcc dot gnu.org
@ 2021-09-01 23:21 ` pinskia at gcc dot gnu.org
  2021-09-01 23:29 ` pinskia at gcc dot gnu.org
                   ` (15 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 23:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #17 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to dave.anglin from comment #14)
> On 2021-09-01 6:35 p.m., dave.anglin at bell dot net wrote:
> > We only get correct code at -O0.
> Maybe cpymemsi expander is problem.

It can't be, as that is only used for !TARGET_64BIT, and this is TARGET_64BIT,
as is obvious from ".LEVEL 2.0w".


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2021-09-01 23:21 ` pinskia at gcc dot gnu.org
@ 2021-09-01 23:29 ` pinskia at gcc dot gnu.org
  2021-09-01 23:45 ` pinskia at gcc dot gnu.org
                   ` (14 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 23:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-09-01
     Ever confirmed|0                           |1

--- Comment #18 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I just noticed the original testcase had the wrong line commented out :)
It should have been:
extern u32  output_len __attribute__((__aligned__(1)));

Anyway, confirmed on aarch64-linux-gnu with -O1 -mstrict-align too.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2021-09-01 23:29 ` pinskia at gcc dot gnu.org
@ 2021-09-01 23:45 ` pinskia at gcc dot gnu.org
  2021-09-01 23:55 ` pinskia at gcc dot gnu.org
                   ` (13 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 23:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #19 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Gimple level does look correct:
 <component_ref 0x7ffff7359300
    type <integer_type 0x7ffff7315b28 u32 readonly unsigned SI
        size <integer_cst 0x7ffff7244b40 constant 32>
        unit-size <integer_cst 0x7ffff7244b58 constant 4>
        align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7ffff7315bd0 precision:32 min <integer_cst 0x7ffff7244b70 0> max <integer_cst
0x7ffff7244b28 4294967295> context <translation_unit_decl 0x7ffff733e870 t.c>>
    readonly
    arg:0 <mem_ref 0x7ffff735b2f8
        type <record_type 0x7ffff73159d8 readonly no-force-blk packed type_0
BLK size <integer_cst 0x7ffff7244b40 32> unit-size <integer_cst 0x7ffff7244b58
4>
            align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7ffff73159d8
            attributes <tree_list 0x7ffff732bd20
                purpose <identifier_node 0x7ffff7344690 packed>> fields
<field_decl 0x7ffff725ebe0 x> context <block 0x7ffff7338420>
            pointer_to_this <pointer_type 0x7ffff7315a80>>

        arg:0 <addr_expr 0x7ffff73326e0 type <pointer_type 0x7ffff7315888>
            constant arg:0 <var_decl 0x7ffff7ff6120 output_len>
            t.c:17:9 start: t.c:17:9 finish: t.c:17:39>
        arg:1 <integer_cst 0x7ffff733bf00 constant 0>>
    arg:1 <field_decl 0x7ffff725ebe0 x
        type <integer_type 0x7ffff7315738 u32 unsigned SI size <integer_cst
0x7ffff7244b40 32> unit-size <integer_cst 0x7ffff7244b58 4>
            align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7ffff7251690 precision:32 min <integer_cst 0x7ffff7244b70 0> max <integer_cst
0x7ffff7244b28 4294967295> context <translation_unit_decl 0x7ffff733e870 t.c>
            pointer_to_this <pointer_type 0x7ffff7315888>>
        unsigned packed SI t.c:12:33 size <integer_cst 0x7ffff7244b40 32>
unit-size <integer_cst 0x7ffff7244b58 4>
        align:8 warn_if_not_align:0 offset_align 128
        offset <integer_cst 0x7ffff7244930 constant 0>
        bit-offset <integer_cst 0x7ffff7244948 constant 0> context <record_type
0x7ffff7315930>>
    t.c:12:103 start: t.c:12:97 finish: t.c:12:105>

The var_decl too:
(gdb) p debug_tree(0x7ffff7ff6120)
 <var_decl 0x7ffff7ff6120 output_len
    type <integer_type 0x7ffff7315738 u32 unsigned SI
        size <integer_cst 0x7ffff7244b40 constant 32>
        unit-size <integer_cst 0x7ffff7244b58 constant 4>
        align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7ffff7251690 precision:32 min <integer_cst 0x7ffff7244b70 0> max <integer_cst
0x7ffff7244b28 4294967295> context <translation_unit_decl 0x7ffff733e870 t.c>
        pointer_to_this <pointer_type 0x7ffff7315888>>
    addressable used public unsigned external read SI t.c:6:14 size
<integer_cst 0x7ffff7244b40 32> unit-size <integer_cst 0x7ffff7244b58 4>
    user align:8 warn_if_not_align:0 context <translation_unit_decl
0x7ffff733e870 t.c>
    attributes <tree_list 0x7ffff732bc30
        purpose <identifier_node 0x7ffff7344500 aligned>
        value <tree_list 0x7ffff732bbb8
            value <integer_cst 0x7ffff7244cf0 constant 1>>> chain
<function_decl 0x7ffff7336a00 test>>


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2021-09-01 23:45 ` pinskia at gcc dot gnu.org
@ 2021-09-01 23:55 ` pinskia at gcc dot gnu.org
  2021-09-02  0:02 ` dave.anglin at bell dot net
                   ` (12 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-01 23:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #20 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
tem was the var_decl
        /* If TEM's type is a union of variable size, pass TARGET to the inner
           computation, since it will need a temporary and TARGET is known
           to have to do.  This occurs in unchecked conversion in Ada.  */
        orig_op0 = op0
          = expand_expr_real (tem,
                              (TREE_CODE (TREE_TYPE (tem)) == UNION_TYPE
                               && COMPLETE_TYPE_P (TREE_TYPE (tem))
                               && (TREE_CODE (TYPE_SIZE (TREE_TYPE (tem)))
                                   != INTEGER_CST)
                               && modifier != EXPAND_STACK_PARM
                               ? target : NULL_RTX),
                              VOIDmode,
                              modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier,
                              NULL, true);
produces:
(gdb) p debug_rtx(op0)
(mem/c:SI (reg/f:DI 71) [1 output_len+0 S4 A32])

Note the A32 here: the MEM claims 32-bit alignment even though the var_decl
has user align:8.

So it is a bug in the expansion of the var_decl.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (19 preceding siblings ...)
  2021-09-01 23:55 ` pinskia at gcc dot gnu.org
@ 2021-09-02  0:02 ` dave.anglin at bell dot net
  2021-09-02  0:19 ` pinskia at gcc dot gnu.org
                   ` (11 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: dave.anglin at bell dot net @ 2021-09-02  0:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #21 from dave.anglin at bell dot net ---
On 2021-09-01 7:21 p.m., pinskia at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162
>
> --- Comment #17 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> (In reply to dave.anglin from comment #14)
>> On 2021-09-01 6:35 p.m., dave.anglin at bell dot net wrote:
>>> We only get correct code at -O0.
>> Maybe cpymemsi expander is problem.
> It can't be, as that is only used for !TARGET_64BIT and this is TARGET_64BIT,
> as is obvious from ".LEVEL 2.0w".
I changed the expanders for both !TARGET_64BIT and TARGET_64BIT.  It didn't
help.  Same error with trunk.

Dave


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (20 preceding siblings ...)
  2021-09-02  0:02 ` dave.anglin at bell dot net
@ 2021-09-02  0:19 ` pinskia at gcc dot gnu.org
  2021-09-02  0:23 ` pinskia at gcc dot gnu.org
                   ` (10 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  0:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #22 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The problem is in emit-rtl.c in set_mem_attributes_minus_bitpos:

  /* We can set the alignment from the type if we are making an object or if
     this is an INDIRECT_REF.  */
  if (objectp || TREE_CODE (t) == INDIRECT_REF)
    attrs.align = MAX (attrs.align, TYPE_ALIGN (type));


Taking the alignment from the type is not correct here: MAX lets
TYPE_ALIGN (type) override a smaller alignment already in attrs.align.
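
To illustrate the arithmetic only (a sketch, not GCC code): the two functions
below are hypothetical names, alignments are in bits as in the A32 dump, and
the sketch assumes the decl's 8-bit user alignment were already in
attrs.align.  The second function is just one conceivable alternative, not a
proposed patch.

```c
/* Hypothetical model of the quoted statement:
   attrs.align = MAX (attrs.align, TYPE_ALIGN (type));
   Alignments are expressed in bits.  */
static unsigned merge_align_max(unsigned attrs_align, unsigned type_align)
{
    return attrs_align > type_align ? attrs_align : type_align;
}

/* Hypothetical alternative: use the type's alignment only when no
   alignment has been recorded yet (0 is treated as "unset" here).  */
static unsigned merge_align_if_unset(unsigned attrs_align, unsigned type_align)
{
    return attrs_align != 0 ? attrs_align : type_align;
}
```

With 8 recorded for the decl and 32 for the type, merge_align_max yields 32
(the A32 seen above), while merge_align_if_unset keeps 8.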


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (21 preceding siblings ...)
  2021-09-02  0:19 ` pinskia at gcc dot gnu.org
@ 2021-09-02  0:23 ` pinskia at gcc dot gnu.org
  2021-09-02  0:47 ` dave.anglin at bell dot net
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  0:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #23 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #22)
> The problem is in emit-rtl.c in set_mem_attributes_minus_bitpos:
> 
>   /* We can set the alignment from the type if we are making an object or if
>      this is an INDIRECT_REF.  */
>   if (objectp || TREE_CODE (t) == INDIRECT_REF)
>     attrs.align = MAX (attrs.align, TYPE_ALIGN (type));
> 
> 
> Taking the alignment from the type is not correct here: MAX lets
> TYPE_ALIGN (type) override a smaller alignment already in attrs.align.

This has been a bug since r0-38512 (2001).


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (22 preceding siblings ...)
  2021-09-02  0:23 ` pinskia at gcc dot gnu.org
@ 2021-09-02  0:47 ` dave.anglin at bell dot net
  2021-09-02  0:49 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: dave.anglin at bell dot net @ 2021-09-02  0:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #24 from dave.anglin at bell dot net ---
On 2021-09-01 8:23 p.m., pinskia at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162
>
> --- Comment #23 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> (In reply to Andrew Pinski from comment #22)
>> The problem is in emit-rtl.c in set_mem_attributes_minus_bitpos:
>>
>>   /* We can set the alignment from the type if we are making an object or if
>>      this is an INDIRECT_REF.  */
>>   if (objectp || TREE_CODE (t) == INDIRECT_REF)
>>     attrs.align = MAX (attrs.align, TYPE_ALIGN (type));
>>
>>
>> Taking the alignment from the type is not correct here: MAX lets
>> TYPE_ALIGN (type) override a smaller alignment already in attrs.align.
> This has been a bug since r0-38512 (2001).
Excellent work!  I assume attrs.align should only be set from the type when
it has not already been set.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (23 preceding siblings ...)
  2021-09-02  0:47 ` dave.anglin at bell dot net
@ 2021-09-02  0:49 ` pinskia at gcc dot gnu.org
  2021-09-02  3:33 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  0:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |88085

--- Comment #25 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
PR 88085 is the same bug.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88085
[Bug 88085] User alignments on var decls not respected if smaller than type
alignment


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (24 preceding siblings ...)
  2021-09-02  0:49 ` pinskia at gcc dot gnu.org
@ 2021-09-02  3:33 ` pinskia at gcc dot gnu.org
  2021-09-02  7:12 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  3:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #26 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Just marking this as a dup of bug 88085.

The workaround is to do this:
typedef unsigned int u32a1  __attribute__((__aligned__(1)));

 extern u32a1  output_len __attribute__((__aligned__(1)));
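
A self-contained sketch of the same idea, assuming GNU C: the reduced
alignment is carried by the type, so an access through that type may not
assume 4-byte alignment.  load_u32a1 is a hypothetical helper, not something
from the test case.

```c
#include <stdint.h>

/* GNU C: on a typedef, the aligned attribute may lower the alignment.  */
typedef uint32_t u32a1 __attribute__((__aligned__(1)));

/* Hypothetical helper: read a 32-bit value through the under-aligned
   type.  The result is in host byte order.  */
static uint32_t load_u32a1(const void *p)
{
    const u32a1 *q = p;
    return *q;
}
```

On a strict-alignment target the compiler then has to emit an access
sequence that is valid at any byte address.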

*** This bug has been marked as a duplicate of bug 88085 ***


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (25 preceding siblings ...)
  2021-09-02  3:33 ` pinskia at gcc dot gnu.org
@ 2021-09-02  7:12 ` rguenth at gcc dot gnu.org
  2021-09-02  9:01 ` arnd at linaro dot org
                   ` (5 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-02  7:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162
Bug 102162 depends on bug 88085, which changed state.

Bug 88085 Summary: User alignments on var decls not respected if smaller than type alignment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88085

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (26 preceding siblings ...)
  2021-09-02  7:12 ` rguenth at gcc dot gnu.org
@ 2021-09-02  9:01 ` arnd at linaro dot org
  2021-09-02  9:41 ` deller at gmx dot de
                   ` (4 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: arnd at linaro dot org @ 2021-09-02  9:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #27 from Arnd Bergmann <arnd at linaro dot org> ---
The Linux kernel instance in arch/parisc/ looks like a bug we fixed in
arch/arm a few years ago by adding the required alignment directive to the
linker script.

If changing the linker script is not possible because of boot loader
requirements, then this should do as well:

diff --git a/arch/parisc/boot/compressed/misc.c b/arch/parisc/boot/compressed/misc.c
index 2d395998f524..b91d6cf80c06 100644
--- a/arch/parisc/boot/compressed/misc.c
+++ b/arch/parisc/boot/compressed/misc.c
@@ -26,7 +26,7 @@
 extern char input_data[];
 extern int input_len;
 /* output_len is inserted by the linker possibly at an unaligned address */
-extern __le32 output_len __aligned(1);
+extern struct { __u8 bytes; } output_len;
 extern char _text, _end;
 extern char _bss, _ebss;
 extern char _startcode_end;


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (27 preceding siblings ...)
  2021-09-02  9:01 ` arnd at linaro dot org
@ 2021-09-02  9:41 ` deller at gmx dot de
  2021-09-02  9:52 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: deller at gmx dot de @ 2021-09-02  9:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #28 from deller at gmx dot de ---
Arnd,
there are various calls to the get_unaligned_X() functions in all kernel
bootloaders, specifically in the kernel decompression routines: 
[deller@ls3530 linux-2.6]$ grep get_unaligned lib/decompress*
lib/decompress_unlz4.c: size_t out_len = get_unaligned_le32(input + in_len);
lib/decompress_unlz4.c: chunksize = get_unaligned_le32(inp);
lib/decompress_unlz4.c:         chunksize = get_unaligned_le32(inp);
lib/decompress_unlzo.c: version = get_unaligned_be16(parse);
lib/decompress_unlzo.c: if (get_unaligned_be32(parse) & HEADER_HAS_FILTER)
lib/decompress_unlzo.c:         dst_len = get_unaligned_be32(in_buf);
lib/decompress_unlzo.c:         src_len = get_unaligned_be32(in_buf);

So sadly it's not possible to work around those cases with linker scripts,
because they work on externally generated compressed files (kernel code)
whose format can't be changed.
The same goes for the output_len variable: it is linked in externally,
directly behind the code, and is not (easily?) changeable.
Helge


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (28 preceding siblings ...)
  2021-09-02  9:41 ` deller at gmx dot de
@ 2021-09-02  9:52 ` pinskia at gcc dot gnu.org
  2021-09-02 13:59 ` deller at gmx dot de
                   ` (2 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #29 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to deller from comment #28)
> Arnd,
> there are various calls to the get_unaligned_X() functions in all kernel
> bootloaders, specifically in the kernel decompression routines: 

The get_unaligned_X() functions are fine and work correctly.  It is only the
declaration of output_len (and similar declarations) that is problematic.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (29 preceding siblings ...)
  2021-09-02  9:52 ` pinskia at gcc dot gnu.org
@ 2021-09-02 13:59 ` deller at gmx dot de
  2021-09-02 14:00 ` deller at gmx dot de
  2021-09-03 23:25 ` deller at gmx dot de
  32 siblings, 0 replies; 34+ messages in thread
From: deller at gmx dot de @ 2021-09-02 13:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #30 from deller at gmx dot de ---
Created attachment 51405
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51405&action=edit
Linux kernel patch to add compiler optimization barrier

The Linux kernel boots successfully with this patch on hppa.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (30 preceding siblings ...)
  2021-09-02 13:59 ` deller at gmx dot de
@ 2021-09-02 14:00 ` deller at gmx dot de
  2021-09-03 23:25 ` deller at gmx dot de
  32 siblings, 0 replies; 34+ messages in thread
From: deller at gmx dot de @ 2021-09-02 14:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #31 from deller at gmx dot de ---
Richard suggested that adding a compiler optimization barrier,
__asm__ ("" : "+r" (__pptr)), might fix the problem.
I tested the attached patch and it works nicely.


* [Bug middle-end/102162] Byte-wise access optimized away at -O1 and above
  2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
                   ` (31 preceding siblings ...)
  2021-09-02 14:00 ` deller at gmx dot de
@ 2021-09-03 23:25 ` deller at gmx dot de
  32 siblings, 0 replies; 34+ messages in thread
From: deller at gmx dot de @ 2021-09-03 23:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162

--- Comment #32 from deller at gmx dot de ---
Fixed in the Linux kernel by declaring the extern int32 as char:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c42813b71a06a2ff4a155aa87ac609feeab76cf3
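
A minimal sketch of the idea, not the kernel commit itself: once the symbol
is only known as char, the compiler can assume nothing beyond byte alignment,
and the value is assembled from single-byte loads.  read_le32 is a
hypothetical stand-in for get_unaligned_le32(), and image_tail stands in for
the bytes the linker places behind the code.

```c
#include <stdint.h>

/* Hypothetical stand-in for get_unaligned_le32(): assemble a
   little-endian 32-bit value one byte at a time.  */
static uint32_t read_le32(const void *p)
{
    const unsigned char *b = p;

    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}

/* Stand-in for the linker-inserted length: 0x12345678 stored
   little-endian at offset 1 of the array.  */
static const unsigned char image_tail[5] = { 0x00, 0x78, 0x56, 0x34, 0x12 };

/* Read the length starting from a byte-typed object.  */
static uint32_t output_size(void)
{
    return read_le32(&image_tail[1]);
}
```

Because every access is a single-byte load, the result does not depend on
the address being aligned or on the host byte order.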


Thread overview: 34+ messages
2021-09-01 16:52 [Bug tree-optimization/102162] New: Byte-wise access optimized away at -O1 and above danglin at gcc dot gnu.org
2021-09-01 18:52 ` [Bug tree-optimization/102162] " danglin at gcc dot gnu.org
2021-09-01 20:14 ` arnd at linaro dot org
2021-09-01 20:52 ` deller at gmx dot de
2021-09-01 20:56 ` dave.anglin at bell dot net
2021-09-01 21:08 ` dave.anglin at bell dot net
2021-09-01 21:15 ` deller at gmx dot de
2021-09-01 21:19 ` dave.anglin at bell dot net
2021-09-01 21:25 ` deller at gmx dot de
2021-09-01 21:29 ` deller at gmx dot de
2021-09-01 21:48 ` pinskia at gcc dot gnu.org
2021-09-01 21:51 ` [Bug middle-end/102162] " pinskia at gcc dot gnu.org
2021-09-01 21:52 ` pinskia at gcc dot gnu.org
2021-09-01 22:35 ` dave.anglin at bell dot net
2021-09-01 22:46 ` dave.anglin at bell dot net
2021-09-01 22:56 ` pinskia at gcc dot gnu.org
2021-09-01 23:19 ` pinskia at gcc dot gnu.org
2021-09-01 23:21 ` pinskia at gcc dot gnu.org
2021-09-01 23:29 ` pinskia at gcc dot gnu.org
2021-09-01 23:45 ` pinskia at gcc dot gnu.org
2021-09-01 23:55 ` pinskia at gcc dot gnu.org
2021-09-02  0:02 ` dave.anglin at bell dot net
2021-09-02  0:19 ` pinskia at gcc dot gnu.org
2021-09-02  0:23 ` pinskia at gcc dot gnu.org
2021-09-02  0:47 ` dave.anglin at bell dot net
2021-09-02  0:49 ` pinskia at gcc dot gnu.org
2021-09-02  3:33 ` pinskia at gcc dot gnu.org
2021-09-02  7:12 ` rguenth at gcc dot gnu.org
2021-09-02  9:01 ` arnd at linaro dot org
2021-09-02  9:41 ` deller at gmx dot de
2021-09-02  9:52 ` pinskia at gcc dot gnu.org
2021-09-02 13:59 ` deller at gmx dot de
2021-09-02 14:00 ` deller at gmx dot de
2021-09-03 23:25 ` deller at gmx dot de
