public inbox for gcc-bugs@sourceware.org
* [Bug c/107836] New: x86_64 inline functions -O2/-O3 optimization error
From: czx211355007 at gmail dot com @ 2022-11-23 15:04 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836

            Bug ID: 107836
           Summary: x86_64 inline functions -O2/-O3 optimization error
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: czx211355007 at gmail dot com
  Target Milestone: ---
            Target: x86_64-linux-gnu

Created attachment 53952
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53952&action=edit
full assembly for function "matrix_mul"

When compiling the following two functions with -O2 or -O3, the generated
assembly code is wrong.

    int dot_product(short* a, short* b, int len){
        int result;
        asm("pandn %%mm5,%%mm5;"::);
        for(int i=0; i < len; i += 4){
            asm(
                "movq %0,%%mm0;"
                "movq %1,%%mm1;"
                "pmaddwd %%mm1,%%mm0;"
                "paddd %%mm0,%%mm5;"
                :
                : "m" (a[i]), "m" (b[i])
            );
        }
        asm("movq %%mm5, %%mm0;"
            "psrlq $32,%%mm5;"
            "paddd %%mm0, %%mm5;"
            "movd %%mm5,%0;"
            "emms"
            :"=r" (result)
            :);
        return result;
    }

    void matrix_mul(int d, short a[d][d], short b[d][d], int c[d][d]){
        for(int i=0;i<d;i++){
            for(int j=0;j<d;j++){
                c[i][j] = dot_product(a[i], b[j], d);
            }
        }
        return;
    }

The part of the assembly code for "matrix_mul" where I see an error:

    14b5: 0f 6f c5             movq   %mm5,%mm0
    14b8: 0f 73 d5 20          psrlq  $0x20,%mm5
    14bc: 0f fe e8             paddd  %mm0,%mm5
    14bf: 0f 7e eb             movd   %mm5,%ebx
    14c2: 0f 77                emms
    14c4: 0f 1f 40 00          nopl   0x0(%rax)
    14c8: 4b 8d 34 0e          lea    (%r14,%r9,1),%rsi
    14cc: 4b 8d 4c 05 00       lea    0x0(%r13,%r8,1),%rcx
    14d1: 31 ff                xor    %edi,%edi
    14d3: 0f 1f 44 00 00       nopl   0x0(%rax,%rax,1)
    14d8: 0f df ed             pandn  %mm5,%mm5
    14db: 49 8d 14 3b          lea    (%r11,%rdi,1),%rdx
    14df: 4c 89 c0             mov    %r8,%rax
    14e2: 66 0f 1f 44 00 00    nopw   0x0(%rax,%rax,1)
    14e8: 0f 6f 00             movq   (%rax),%mm0
    14eb: 0f 6f 0a             movq   (%rdx),%mm1

Here mm0 and mm5 are used before values are assigned to mm0 and mm1, which
leads to a calculation error when "matrix_mul" is used for matrix
multiplication. When compiling at a lower optimization level, there is no
error and the results are correct.
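
A plain C reference can help when comparing results across optimization
levels. The sketch below is not part of the report: dot_product_ref and
matrix_mul_ref are hypothetical names, and it assumes that all products and
sums fit in an int. Since matrix_mul passes row i of a and row j of b, the
reference computes a multiplied by the transpose of b, the same as the
reported code.

```c
/* Scalar reference for the reported dot_product (hypothetical name). */
int dot_product_ref(const short *a, const short *b, int len)
{
    int sum = 0;

    for (int i = 0; i < len; i++)
        sum += a[i] * b[i];   /* operands are promoted to int */
    return sum;
}

/* Same loop structure as the reported matrix_mul: c[i][j] is the dot
   product of row i of a with row j of b. */
void matrix_mul_ref(int d, short a[d][d], short b[d][d], int c[d][d])
{
    for (int i = 0; i < d; i++) {
        for (int j = 0; j < d; j++)
            c[i][j] = dot_product_ref(a[i], b[j], d);
    }
}
```

Unlike the MMX version, this reference does not require d to be a multiple
of 4.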
* [Bug c/107836] x86_64 inline functions -O2/-O3 optimization error
From: pinskia at gcc dot gnu.org @ 2022-11-23 15:11 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |inline-asm

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Your inline asm is missing some clobbers, and I don't think you can use
inline asm this way, keeping a value in mm5 across asm statements, since the
compiler does not know you did that.
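
One possible way to act on comment #1, as a minimal sketch rather than the
reporter's or the GCC developers' code: make each asm statement
self-contained, so that no MMX register has to survive from one statement to
the next, and list the register it overwrites as a clobber. The name
dot_product_mmx is hypothetical. The sketch assumes an x86-64 target with
GNU inline asm, a len that is a multiple of 4, and that no x87 (long double)
values are live around the asm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rewrite: each asm statement loads its inputs, computes,
   and stores its result, so nothing is kept in mm5 between statements. */
int dot_product_mmx(const short *a, const short *b, int len)
{
    int result = 0;

    assert(len % 4 == 0);   /* four shorts are consumed per step */
    for (int i = 0; i < len; i += 4) {
        uint64_t packed;    /* two 32-bit partial sums from pmaddwd */

        __asm__ ("movq %1, %%mm0\n\t"
                 "pmaddwd %2, %%mm0\n\t"
                 "movq %%mm0, %0\n\t"
                 "emms"
                 : "=m" (packed)
                 /* The array casts tell the compiler that 8 bytes are
                    read, not only the single element a[i] or b[i]. */
                 : "m" (*(const short (*)[4]) &a[i]),
                   "m" (*(const short (*)[4]) &b[i])
                 : "mm0");  /* mm0 is overwritten by this statement */

        /* Accumulate in C, where the compiler can see the data flow. */
        result += (int32_t) (uint32_t) packed;
        result += (int32_t) (uint32_t) (packed >> 32);
    }
    return result;
}
```

Running emms in every iteration costs some speed; the point here is only
that the output of each statement depends on nothing but its declared
inputs. The MMX registers alias the x87 register stack, so code that keeps
long double values live would also need the st registers in the clobber
list.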
* [Bug c/107836] x86_64 inline functions -O2/-O3 optimization error
From: schwab@linux-m68k.org @ 2022-11-23 15:48 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Andreas Schwab <schwab@linux-m68k.org> ---
There is no dependency whatsoever between the asm statements, thus they can
be moved around freely. Especially the third one is producing a constant
output as seen by the compiler, thus moving it to the top of the function is
perfectly valid.
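
The dependency that comment #2 says is missing can be made visible by
carrying the accumulator in a C variable instead of a hidden register. The
sketch below swaps the inline asm for the MMX intrinsics from <mmintrin.h>,
which is a different technique from the one in the report. The name
dot_product_intrin is hypothetical, and the sketch assumes an x86 target
whose compiler provides that header and a len that is a multiple of 4.

```c
#include <assert.h>
#include <mmintrin.h>
#include <string.h>

/* Hypothetical intrinsics version: acc is an ordinary C variable, so the
   compiler sees that the final reduction depends on the loop. */
int dot_product_intrin(const short *a, const short *b, int len)
{
    __m64 acc = _mm_setzero_si64();
    int result;

    assert(len % 4 == 0);   /* four shorts are consumed per step */
    for (int i = 0; i < len; i += 4) {
        __m64 va, vb;

        /* memcpy avoids alignment assumptions on a and b. */
        memcpy(&va, &a[i], sizeof va);
        memcpy(&vb, &b[i], sizeof vb);

        /* pmaddwd followed by paddd, as in the reported loop body. */
        acc = _mm_add_pi32(acc, _mm_madd_pi16(va, vb));
    }

    /* Add the high 32-bit lane to the low one, then extract the low lane. */
    acc = _mm_add_pi32(acc, _mm_srli_si64(acc, 32));
    result = _mm_cvtsi64_si32(acc);
    _mm_empty();            /* emms */
    return result;
}
```

Because every step consumes the value produced by the previous one, the
final reduction cannot be treated as a constant and hoisted above the loop.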
* [Bug c/107836] x86_64 inline functions -O2/-O3 optimization error
From: czx211355007 at gmail dot com @ 2022-11-24 8:36 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107836

--- Comment #3 from Zixuan Chen <czx211355007 at gmail dot com> ---
I think there is a data dependency between the second asm statement and the
third, a read-after-write one. If the third one is moved to the top, then we
can't get the correct value of mm5 (mm0).

Also, could you explain why the result is correct when compiling with -O1,
where the asm statements stay in their original order?

On Wed, Nov 23, 2022 at 23:48, schwab@linux-m68k.org
<gcc-bugzilla@gcc.gnu.org> wrote:

> --- Comment #2 from Andreas Schwab <schwab@linux-m68k.org> ---
> There is no dependency whatsoever between the asm statements, thus they
> can be moved around freely. Especially the third one is producing a
> constant output as seen by the compiler, thus moving it to the top of the
> function is perfectly valid.