public inbox for gcc-bugs@sourceware.org
* [Bug c/106818] New: code is genereated differently with or without 'extern'
@ 2022-09-02 19:26 pangbw at gmail dot com
  2022-09-02 19:28 ` [Bug c/106818] " pangbw at gmail dot com
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: pangbw at gmail dot com @ 2022-09-02 19:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

            Bug ID: 106818
           Summary: code is genereated differently with or without
                    'extern'
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pangbw at gmail dot com
  Target Milestone: ---

Just wondering why GCC would generate such different code:

https://godbolt.org/z/ncE5sWYe8


* [Bug c/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
@ 2022-09-02 19:28 ` pangbw at gmail dot com
  2022-09-02 19:37 ` pinskia at gcc dot gnu.org
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pangbw at gmail dot com @ 2022-09-02 19:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #1 from baoshan <pangbw at gmail dot com> ---
With 'extern', four 'sb' instructions are used to store the value into "p->i",
while without 'extern' only one 'sw' is used.


* [Bug c/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
  2022-09-02 19:28 ` [Bug c/106818] " pangbw at gmail dot com
@ 2022-09-02 19:37 ` pinskia at gcc dot gnu.org
  2022-09-02 19:38 ` [Bug middle-end/106818] " pinskia at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-09-02 19:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Most likely known alignment or not.
Riscv targets are sensitive to alignment.


* [Bug middle-end/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
  2022-09-02 19:28 ` [Bug c/106818] " pangbw at gmail dot com
  2022-09-02 19:37 ` pinskia at gcc dot gnu.org
@ 2022-09-02 19:38 ` pinskia at gcc dot gnu.org
  2022-09-02 20:03 ` palmer at gcc dot gnu.org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-09-02 19:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Please also attach or paste the testcase into the bug report, not just a
godbolt link.


* [Bug middle-end/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (2 preceding siblings ...)
  2022-09-02 19:38 ` [Bug middle-end/106818] " pinskia at gcc dot gnu.org
@ 2022-09-02 20:03 ` palmer at gcc dot gnu.org
  2022-09-02 22:29 ` pangbw at gmail dot com
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: palmer at gcc dot gnu.org @ 2022-09-02 20:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

palmer at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |palmer at gcc dot gnu.org

--- Comment #4 from palmer at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #2)
> Most likely known alignment or not.
> Riscv targets are sensitive to alignment.

Not sure I'm allowed to paste the code in for them, but that's what's going on
here: with -mtune=thead-c906 both cases have a single store; the default tuning
is for Rocket, which has very slow misaligned accesses.

That said, I think we actually have a bug here: if the extern symbol was really
of unknown alignment then sharing the lui might not work.


* [Bug middle-end/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (3 preceding siblings ...)
  2022-09-02 20:03 ` palmer at gcc dot gnu.org
@ 2022-09-02 22:29 ` pangbw at gmail dot com
  2022-09-02 22:33 ` pangbw at gmail dot com
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pangbw at gmail dot com @ 2022-09-02 22:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #5 from baoshan <pangbw at gmail dot com> ---
Per Andrew's request:

For GCC built for RISC-V,
With the following code:
struct sss_t {
    int i;
    int j;
} sss;
extern char array[sizeof(struct sss_t )];
void foo()
{
    struct sss_t *p = (struct sss_t *)array;
    p->i = 10;
}

The following asm is generated:
foo():
        lui     a5,%hi(array)
        li      a4,10
        sb      a4,%lo(array)(a5)
        sb      zero,%lo(array+1)(a5)
        sb      zero,%lo(array+2)(a5)
        sb      zero,%lo(array+3)(a5)
        ret
sss:
        .zero   8

If 'extern' is removed from the C code, the following asm is generated instead:

foo():
        lui     a5,%hi(array)
        li      a4,10
        sw      a4,%lo(array)(a5)
        ret
array:
        .zero   8
sss:
        .zero   8
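
Both listings store the same value. As a minimal sketch (not from the report) of why the four 'sb' are equivalent to the single 'sw' on a little-endian target such as RISC-V: the low byte holds 10 and the other three bytes are zero, which is what 'sb a4' followed by three 'sb zero' writes. The helper names store_u32_bytewise and load_u32_bytewise are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value one byte at a time, least
   significant byte first, the way the four 'sb' instructions do.
   It makes no assumption about the alignment of dst. */
static void store_u32_bytewise(unsigned char *dst, uint32_t value)
{
    for (int k = 0; k < 4; k++)
        dst[k] = (unsigned char)(value >> (8 * k));
}

/* Hypothetical helper: reassemble the little-endian word from its bytes. */
static uint32_t load_u32_bytewise(const unsigned char *src)
{
    uint32_t value = 0;
    for (int k = 0; k < 4; k++)
        value |= (uint32_t)src[k] << (8 * k);
    return value;
}
```

Calling store_u32_bytewise(buf + 1, 10) on an odd address is fine, which is the property the compiler falls back on when it cannot prove the alignment of the destination.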


* [Bug middle-end/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (4 preceding siblings ...)
  2022-09-02 22:29 ` pangbw at gmail dot com
@ 2022-09-02 22:33 ` pangbw at gmail dot com
  2022-09-02 22:37 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pangbw at gmail dot com @ 2022-09-02 22:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #6 from baoshan <pangbw at gmail dot com> ---
> really of unknown alignment then sharing the lui might not work.
Can you elaborate on why sharing the lui might not work?


* [Bug middle-end/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (5 preceding siblings ...)
  2022-09-02 22:33 ` pangbw at gmail dot com
@ 2022-09-02 22:37 ` pinskia at gcc dot gnu.org
  2022-09-02 23:59 ` palmer at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-09-02 22:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to baoshan from comment #6)
> > really of unknown alignment then sharing the lui might not work.
> Can you elaborate on why sharing the lui might not work?

Linker relaxation not coming in and relaxing it to use gp offsets instead.
It is one of the worst parts of the riscv toolchain ...


* [Bug middle-end/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (6 preceding siblings ...)
  2022-09-02 22:37 ` pinskia at gcc dot gnu.org
@ 2022-09-02 23:59 ` palmer at gcc dot gnu.org
  2022-09-05 10:15 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: palmer at gcc dot gnu.org @ 2022-09-02 23:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #8 from palmer at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #7)
> (In reply to baoshan from comment #6)
> > > really of unknown alignment then sharing the lui might not work.
> > Can you elaborate on why sharing the lui might not work?

Unless I've managed to screw up some bit arithmetic here, it's just overflow
that we're not detecting at link time:

$ cat test.c 
extern char glob[4];

int _start(void) {
        int *i = (int *)glob;
        return *i;
}
$ cat glob.s 
.section .sdata
.balign 4096
.global empty
empty:
.rep 2046
.byte 0
.endr
.global glob
glob:
.byte 1, 2, 3, 4
$ riscv64-linux-gnu-gcc test.c glob.s -O3 -o test -static -fno-PIE \
    -mcmodel=medlow -mexplicit-relocs -nostdlib
$ riscv64-linux-gnu-objdump -d test
...
000000000001010c <_start>:
   1010c:       66c9                    lui     a3,0x12
   1010e:       7ff6c703                lbu     a4,2047(a3) # 127ff <glob+0x1>
   10112:       7fe6c603                lbu     a2,2046(a3)
   10116:       8006c783                lbu     a5,-2048(a3)
   1011a:       8016c503                lbu     a0,-2047(a3)
...

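So that's going to load from these addresses:

a4 = 0x127ff
a2 = 0x127fe
a5 = 0x11800
a0 = 0x11801

Which is wrong: glob is at 0x127fe, so the last two bytes should come from
0x12800 and 0x12801.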

We can't detect it at link time because both relocations are being processed
correctly, they just don't know about each other (and really can't, because
there's nothing coupling them together).
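
The overflow can be reproduced with plain integer arithmetic. The sketch below assumes the usual definitions of the two relocation operators (%hi adds 0x800 before shifting so that %lo can be a signed 12-bit value) and takes glob = 0x127fe from the objdump output above. The function names riscv_hi, riscv_lo and shared_lui_address are hypothetical.

```c
#include <stdint.h>

/* %hi(x): upper 20 bits, rounded so that the signed 12-bit %lo(x)
   reaches x from (%hi(x) << 12). */
static uint32_t riscv_hi(uint32_t x)
{
    return (x + 0x800u) >> 12;
}

/* %lo(x): low 12 bits of x, sign-extended to the range [-2048, 2047]. */
static int32_t riscv_lo(uint32_t x)
{
    int32_t low = (int32_t)(x & 0xfffu);
    return low >= 0x800 ? low - 0x1000 : low;
}

/* Address actually accessed by
       lui r, %hi(sym)
       lbu .., %lo(sym + k)(r)
   that is, when the LUI computed for sym is reused for sym + k. */
static uint32_t shared_lui_address(uint32_t sym, uint32_t k)
{
    return (riscv_hi(sym) << 12) + (uint32_t)riscv_lo(sym + k);
}
```

With sym = 0x127fe, k = 0 and k = 1 give 0x127fe and 0x127ff, but k = 2 and k = 3 give 0x11800 and 0x11801: %lo(sym + 2) wraps to -2048 while the shared LUI still holds 0x12 instead of 0x13. Each relocation is individually correct, which is why the linker has nothing to complain about.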

> Linker relaxation not coming in and relaxing it to use gp offsets instead.
> It is one of the worst parts of the riscv toolchain ...

Though this time linker relaxation is actually biting us twice:

First, it's masking this problem for small programs: if these accesses are all
within range of GP we end up producing executables that function fine, as the
relaxation calculates the full addresses to use as GP offsets.

Second, the GP relaxations just don't work when we share LUIs for
possibly-misaligned symbols because we delete the LUI if the first low-half is
within GP range.  For example:

$ cat glob.s 
.section .sdata
.global empty
empty:
.rep 4090
.byte 0
.endr
.global glob
glob:
.byte 1, 2, 3, 4
$ riscv64-linux-gnu-gcc test.c glob.s -O3 -o test -static -fno-PIE \
    -mcmodel=medlow -mexplicit-relocs --save-temps -nostdlib
$ riscv64-linux-gnu-objdump -d test
...
000000000001010c <_start>:
   1010c:       7fb1c703                lbu     a4,2043(gp) # 12127 <glob+0x1>
   10110:       7fa1c603                lbu     a2,2042(gp) # 12126 <glob>
   10114:       1286c783                lbu     a5,296(a3)
   10118:       1296c503                lbu     a0,297(a3)
...

We had that problem with the AUIPC->GP relaxation as well, but could fix it
there because the low half points to the high half.  Here I think there's also
nothing we can do in the linker, as there's no way to tell when the result of
the LUI is completely unused -- we could deal with simple cases like this, but
with control flow there's no way to handle all of them.


* [Bug middle-end/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (7 preceding siblings ...)
  2022-09-02 23:59 ` palmer at gcc dot gnu.org
@ 2022-09-05 10:15 ` rguenth at gcc dot gnu.org
  2022-10-21 16:22 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-09-05 10:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to baoshan from comment #5)
> Per Andrew's request:
> 
> For GCC built for RISC-V,
> With the following code:
> struct sss_t {
>     int i;
>     int j;
> } sss;
> extern char array[sizeof(struct sss_t )];
> void foo()
> {
>     struct sss_t *p = (struct sss_t *)array;
>     p->i = 10;

Note we should know here that p->i = 10 accesses memory aligned
according to the alignment of an 'int' (unless riscv is packed by
default and doesn't have 'sss' aligned to an 'int').

That's because C guarantees this.

We lose this fact because get_object_alignment_2 "knows better", aka
in some places GCC tries to work like a DWIM compiler, saving people
from mistakes.  Here get_inner_reference sees we access 'array' and
adjusts alignment according to its alignment, ignoring what the language
standard guarantees.
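
As a sketch of the alignment facts involved (not a fix proposed in this thread): a plain char array only promises alignment 1, while struct sss_t needs the alignment of an int. Stating the alignment in the buffer's own declaration makes it visible without relying on the cast, and memcpy is one way to store the member without dereferencing a cast pointer at all. The names buffer, store_i and load_i are hypothetical, and whether GCC then emits a single 'sw' depends on the version and tuning in use.

```c
#include <stdalign.h>
#include <stddef.h>
#include <string.h>

struct sss_t {
    int i;
    int j;
};

/* Hypothetical buffer: unlike a plain char array, its declaration
   carries the alignment of the struct it is meant to hold. */
static alignas(struct sss_t) char buffer[sizeof(struct sss_t)];

/* Store the 'i' member through memcpy instead of a cast pointer. */
static void store_i(char *raw, int value)
{
    memcpy(raw + offsetof(struct sss_t, i), &value, sizeof value);
}

/* Read the 'i' member back the same way. */
static int load_i(const char *raw)
{
    int value;
    memcpy(&value, raw + offsetof(struct sss_t, i), sizeof value);
    return value;
}
```

After store_i(buffer, 10), load_i(buffer) returns 10 regardless of how the compiler chooses to expand the copy.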


* [Bug middle-end/106818] code is genereated differently with or without 'extern'
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (8 preceding siblings ...)
  2022-09-05 10:15 ` rguenth at gcc dot gnu.org
@ 2022-10-21 16:22 ` pinskia at gcc dot gnu.org
  2022-10-21 16:31 ` [Bug target/106818] riscv produces bad low_sum while doing expansion of strict aligned stores/load pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-10-21 16:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 53745
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53745&action=edit
Testcase


* [Bug target/106818] riscv produces bad low_sum while doing expansion of strict aligned stores/load
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (9 preceding siblings ...)
  2022-10-21 16:22 ` pinskia at gcc dot gnu.org
@ 2022-10-21 16:31 ` pinskia at gcc dot gnu.org
  2022-10-21 16:33 ` pinskia at gcc dot gnu.org
  2023-11-29 18:45 ` pinskia at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-10-21 16:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-10-21
             Status|UNCONFIRMED                 |NEW
            Summary|code is genereated          |riscv produces bad low_sum
                   |differently with or without |while doing expansion of
                   |'extern'                    |strict aligned stores/load
          Component|middle-end                  |target
     Ever confirmed|0                           |1
           Keywords|missed-optimization         |wrong-code

--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So on aarch64 with -O3 -mstrict-align we produce:

        adrp    x1, array
        add     x0, x1, :lo12:array
        mov     w2, 10
        strb    w2, [x1, #:lo12:array]
        strb    wzr, [x0, 1]
        strb    wzr, [x0, 2]
        strb    wzr, [x0, 3]

PowerPC produces similar output to aarch64 here too.

Which means this is a target issue where lo_sum is being combined with the add
when it should not be.

Confirmed for riscv.

It happens at expand time:
(insn 15 14 16 (set (mem/c:QI (lo_sum:DI (reg:DI 72)
                (symbol_ref:DI ("array") [flags 0xc4]  <var_decl 0x7f347a4c51b0
array>)) [1 MEM[(struct sss_t *)&array].i+0 S1 A8])
        (reg:QI 78)) "/app/example.cpp":10:10 -1
     (nil))

(insn 16 15 17 (set (reg:DI 80)
        (zero_extend:DI (mem/c:QI (lo_sum:DI (reg:DI 72)
                    (const:DI (plus:DI (symbol_ref:DI ("array") [flags 0xc4] 
<var_decl 0x7f347a4c51b0 array>)
                            (const_int 1 [0x1])))) [1 MEM[(struct sss_t
*)&array].i+1 S1 A8]))) "/app/example.cpp":10:10 -1
     (nil))

Note the alignment is what decides between expanding to byte stores or to one
store, but the wrong code is due to the backend expanding the address
incorrectly.

ARM64 (and powerpc) expansion produces:
(insn 18 17 19 (set (reg:QI 101)
        (mem/c:QI (plus:DI (reg/f:DI 92)
                (const_int 1 [0x1])) [1 MEM[(struct sss_t *)&array].i+1 S1
A8])) "/app/example.cpp":10:10 -1
     (nil))

Notice the reg 92 rather than the lo_sum part.
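
The two expansions can be compared numerically. This is a sketch under the usual %hi/%lo definitions, with hypothetical function names: addr_lo_sum_per_byte models the RISC-V RTL above (one shared high part, a separate lo_sum of array+k per byte), and addr_base_plus_offset models the aarch64/PowerPC form (full address in a register, then plus k).

```c
#include <stdint.h>

/* Upper part as LUI materializes it: %hi(x) << 12. */
static uint32_t high_part(uint32_t x)
{
    return ((x + 0x800u) >> 12) << 12;
}

/* RISC-V expansion: reg = high part of sym, access lo_sum(reg, sym + k).
   The low part is taken from sym + k but added to the high part of sym. */
static uint32_t addr_lo_sum_per_byte(uint32_t sym, uint32_t k)
{
    uint32_t low_of_sym_plus_k = (sym + k) - high_part(sym + k);
    return high_part(sym) + low_of_sym_plus_k;
}

/* aarch64/PowerPC-style expansion: reg = full address of sym,
   access plus(reg, k). */
static uint32_t addr_base_plus_offset(uint32_t sym, uint32_t k)
{
    uint32_t base = high_part(sym) + (sym - high_part(sym));
    return base + k;
}

/* Returns 1 if both expansions touch the same four bytes, 0 otherwise. */
static int strategies_agree(uint32_t sym)
{
    for (uint32_t k = 0; k < 4; k++)
        if (addr_lo_sum_per_byte(sym, k) != addr_base_plus_offset(sym, k))
            return 0;
    return 1;
}
```

The two forms agree for any 4-byte-aligned sym, since sym + 3 can never cross the point where the high part changes. They disagree only when the low 12 bits of sym are 0x7fd, 0x7fe or 0x7ff, which can only happen when the object is less aligned than the access being split.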


* [Bug target/106818] riscv produces bad low_sum while doing expansion of strict aligned stores/load
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (10 preceding siblings ...)
  2022-10-21 16:31 ` [Bug target/106818] riscv produces bad low_sum while doing expansion of strict aligned stores/load pinskia at gcc dot gnu.org
@ 2022-10-21 16:33 ` pinskia at gcc dot gnu.org
  2023-11-29 18:45 ` pinskia at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-10-21 16:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
MIPS also does the correct thing (it has store left/right):

(insn 10 9 11 (set (reg/f:SI 196)
        (lo_sum:SI (reg:SI 194)
            (symbol_ref:SI ("array") [flags 0x40]  <var_decl 0x7f1c8d421510
array>))) "/app/example.cpp":10:10 -1
     (nil))

(insn 11 10 12 (set (mem/c:BLK (lo_sum:SI (reg:SI 194)
                (symbol_ref:SI ("array") [flags 0x40]  <var_decl 0x7f1c8d421510
array>)) [1 MEM[(struct sss_t *)&array].i+0 S4 A8])
        (unspec:BLK [
                (reg:SI 195)
                (mem/c:QI (lo_sum:SI (reg:SI 194)
                        (symbol_ref:SI ("array") [flags 0x40]  <var_decl
0x7f1c8d421510 array>)) [1 MEM[(struct sss_t *)&array].i+0 S1 A8])
            ] UNSPEC_STORE_LEFT)) "/app/example.cpp":10:10 -1
     (nil))

(insn 12 11 0 (set (mem/c:BLK (lo_sum:SI (reg:SI 194)
                (symbol_ref:SI ("array") [flags 0x40]  <var_decl 0x7f1c8d421510
array>)) [1 MEM[(struct sss_t *)&array].i+0 S4 A8])
        (unspec:BLK [
                (reg:SI 195)
                (mem/c:QI (plus:SI (reg/f:SI 196)
                        (const_int 3 [0x3])) [1 MEM[(struct sss_t *)&array].i+3
S1 A8])
                (mem/c:BLK (lo_sum:SI (reg:SI 194)
                        (symbol_ref:SI ("array") [flags 0x40]  <var_decl
0x7f1c8d421510 array>)) [1 MEM[(struct sss_t *)&array].i+0 S4 A8])
            ] UNSPEC_STORE_RIGHT)) "/app/example.cpp":10:10 -1
     (nil))

But you can see the plus of reg 196 and 3, where reg 196 was set to the
lo_sum.


* [Bug target/106818] riscv produces bad low_sum while doing expansion of strict aligned stores/load
  2022-09-02 19:26 [Bug c/106818] New: code is genereated differently with or without 'extern' pangbw at gmail dot com
                   ` (11 preceding siblings ...)
  2022-10-21 16:33 ` pinskia at gcc dot gnu.org
@ 2023-11-29 18:45 ` pinskia at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-11-29 18:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #13 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup of bug 100604.

*** This bug has been marked as a duplicate of bug 100604 ***
