public inbox for gcc-bugs@sourceware.org
* [Bug c/107836] New: x86_64 inline functions -O2/-O3 optimization error
From: czx211355007 at gmail dot com @ 2022-11-23 15:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836
Bug ID: 107836
Summary: x86_64 inline functions -O2/-O3 optimization error
Product: gcc
Version: 11.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: czx211355007 at gmail dot com
Target Milestone: ---
Target: x86_64-linux-gnu
Created attachment 53952
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53952&action=edit
full assembly for function "matrix_mul"
When compiling the following two functions with -O2 or -O3 options, the
assembly code generated is wrong.
int dot_product(short* a, short* b, int len){
    int result;
    asm("pandn %%mm5,%%mm5;"::);
    for(int i=0; i < len; i += 4){
        asm(
            "movq %0,%%mm0;"
            "movq %1,%%mm1;"
            "pmaddwd %%mm1,%%mm0;"
            "paddd %%mm0,%%mm5;"
            :
            : "m" (a[i]), "m" (b[i])
        );
    }
    asm("movq %%mm5, %%mm0;"
        "psrlq $32,%%mm5;"
        "paddd %%mm0, %%mm5;"
        "movd %%mm5,%0;"
        "emms"
        :"=r" (result)
        :);
    return result;
}

void matrix_mul(int d, short a[d][d], short b[d][d], int c[d][d]){
    for(int i=0;i<d;i++){
        for(int j=0;j<d;j++){
            c[i][j] = dot_product(a[i], b[j], d);
        }
    }
    return;
}
The part of the assembly code for "matrix_mul" where I see an error:
    14b5: 0f 6f c5           movq   %mm5,%mm0
    14b8: 0f 73 d5 20        psrlq  $0x20,%mm5
    14bc: 0f fe e8           paddd  %mm0,%mm5
    14bf: 0f 7e eb           movd   %mm5,%ebx
    14c2: 0f 77              emms
    14c4: 0f 1f 40 00        nopl   0x0(%rax)
    14c8: 4b 8d 34 0e        lea    (%r14,%r9,1),%rsi
    14cc: 4b 8d 4c 05 00     lea    0x0(%r13,%r8,1),%rcx
    14d1: 31 ff              xor    %edi,%edi
    14d3: 0f 1f 44 00 00     nopl   0x0(%rax,%rax,1)
    14d8: 0f df ed           pandn  %mm5,%mm5
    14db: 49 8d 14 3b        lea    (%r11,%rdi,1),%rdx
    14df: 4c 89 c0           mov    %r8,%rax
    14e2: 66 0f 1f 44 00 00  nopw   0x0(%rax,%rax,1)
    14e8: 0f 6f 00           movq   (%rax),%mm0
    14eb: 0f 6f 0a           movq   (%rdx),%mm1
Here mm5 and mm0 are read (14b5 to 14bf) before the loop that starts at 14d8
zeroes mm5 and loads mm0 and mm1, which leads to wrong results when
"matrix_mul" is used for matrix multiplication.
At lower optimization levels the error does not occur and the results are
correct.
From: pinskia at gcc dot gnu.org @ 2022-11-23 15:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |inline-asm
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Your inline-asm is missing some clobbers, and I don't think you can use
inline-asm this way, keeping a value around inside mm5, since the compiler does
not know you did that.
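
One way to avoid the hidden state in mm5 that this comment points to is to drop
the inline asm and use the MMX intrinsics from <mmintrin.h>, so that the
accumulator is an ordinary C variable the compiler can track. This is a minimal
sketch, not something proposed in the thread: the name dot_product_intrin is
hypothetical, len is assumed to be a multiple of 4, and a plain C loop is used
as a fallback where __MMX__ is not defined.

```c
#include <string.h>

#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical rewrite of dot_product; assumes len is a multiple of 4. */
int dot_product_intrin(const short *a, const short *b, int len)
{
#if defined(__MMX__)
    __m64 acc = _mm_setzero_si64();      /* accumulator is a visible C variable */
    for (int i = 0; i < len; i += 4) {
        __m64 va, vb;
        memcpy(&va, a + i, sizeof va);   /* load four 16-bit values */
        memcpy(&vb, b + i, sizeof vb);
        /* pmaddwd followed by paddd, as in the original asm */
        acc = _mm_add_pi32(acc, _mm_madd_pi16(va, vb));
    }
    /* Add the high 32-bit lane to the low one (psrlq / paddd / movd). */
    acc = _mm_add_pi32(acc, _mm_srli_si64(acc, 32));
    int result = _mm_cvtsi64_si32(acc);
    _mm_empty();                         /* emms */
    return result;
#else
    /* Portable fallback for targets without MMX. */
    int result = 0;
    for (int i = 0; i < len; i++)
        result += a[i] * b[i];
    return result;
#endif
}
```

Because every intermediate value is passed between intrinsics as a C value,
the compiler sees the full data flow and there is no register state for it to
lose when it inlines or reorders code.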
From: schwab@linux-m68k.org @ 2022-11-23 15:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836
Andreas Schwab <schwab@linux-m68k.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #2 from Andreas Schwab <schwab@linux-m68k.org> ---
There is no dependency whatsoever between the asm statements, thus they can be
moved around freely. Especially the third one is producing a constant output as
seen by the compiler, thus moving it to the top of the function is perfectly
valid.
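
The point about dependencies can be illustrated with a smaller case than the
MMX code. In the sketch below (the helpers add_via_asm and sum_via_asm are
hypothetical, not from the bug report), the running value is passed through a
"+r" operand, so each asm statement visibly reads the result of the previous
one and the compiler has to keep them in order. An asm with outputs but no
inputs and no volatile qualifier, like the third statement in dot_product,
gives the compiler no such link.

```c
/* Hypothetical helper: acc += x, with the data flow expressed in operands. */
static int add_via_asm(int acc, int x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "+r"(acc): acc is read and written; "r"(x): x is only read. */
    __asm__("addl %1, %0" : "+r"(acc) : "r"(x) : "cc");
#else
    acc += x;   /* portable fallback for other targets */
#endif
    return acc;
}

/* Sum an array by chaining the asm statements through acc. */
int sum_via_asm(const int *v, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc = add_via_asm(acc, v[i]);
    return acc;
}
```

The same idea applies to the reported code: a value that one asm statement
produces and another consumes has to appear as an output of the first and an
input of the second, rather than living in a register the compiler was never
told about.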
From: czx211355007 at gmail dot com @ 2022-11-24 8:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836
--- Comment #3 from Zixuan Chen <czx211355007 at gmail dot com> ---
I think there is a data dependency between the second asm statement and the
third: a read-after-write one. If the third statement is moved to the top, we
can't get the correct value of mm5 (mm0). Also, could you explain why compiling
with -O1 gives the expected result, with the asm statements remaining in their
original order?
On Wed, Nov 23, 2022 at 23:48, schwab@linux-m68k.org <gcc-bugzilla@gcc.gnu.org> wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836
>
> Andreas Schwab <schwab@linux-m68k.org> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>          Resolution|---                         |INVALID
>              Status|UNCONFIRMED                 |RESOLVED
>
> --- Comment #2 from Andreas Schwab <schwab@linux-m68k.org> ---
> There is no dependency whatsoever between the asm statements, thus they can be
> moved around freely. Especially the third one is producing a constant output as
> seen by the compiler, thus moving it to the top of the function is perfectly
> valid.