public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/52574] New: [4.6 Regression] gcc tree optimizer generates incorrect vector load instructions for x86_64, app crashes
@ 2012-03-13 1:20 doko at gcc dot gnu.org
2012-03-13 7:13 ` [Bug tree-optimization/52574] " jakub at gcc dot gnu.org
2012-03-13 20:57 ` deepak.ravi at gmail dot com
0 siblings, 2 replies; 3+ messages in thread
From: doko at gcc dot gnu.org @ 2012-03-13 1:20 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52574
Bug #: 52574
Summary: [4.6 Regression] gcc tree optimizer generates
incorrect vector load instructions for x86_64, app
crashes
Classification: Unclassified
Product: gcc
Version: 4.6.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: doko@gcc.gnu.org
[forwarded from http://bugs.debian.org/663654]
The following versions of gcc:
Debian gcc-4.6.3-1,
Debian gcc-4.4.6-14,
Debian gcc-4.6.2-14,
Debian gcc-4.4.6-15,
Ubuntu 4.4.3-4ubuntu5
generate *wrong* code: aligned vector loads instead of unaligned vector loads
for the x86_64 arch. This causes the compiled code to crash with
SIGSEGV (General Protection Fault).
The bug is *not* present on trunk or in gcc-4.5.3-12.
Consider the following program:
void foo(int* __restrict ia, int n){
    int i;
    for(i=0;i<n;i++){
        ia[i]=ia[i]*ia[i];
    }
}
int main(){
    int a[9];
    int sum=0,i;
    for(i=0;i<9;i++){
        a[i]=(i*i)%128;
    }
    foo((int*)((char*)a+2), 8);
    for(i=0;i<9;i++){
        sum+=a[i];
    }
    return sum;
}
On x86 and x86_64, unaligned word accesses are valid:
- *((int*)<unaligned memory address>)
But x86_64 SSE has two kinds of vector move instructions:
- aligned vector move (movdqa)
- unaligned vector move (movdqu)
Using an aligned vector move with an unaligned address
causes the application to crash.
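As a rough illustration of the alignment condition described above, the sketch below checks whether the address passed to foo() in the testcase could satisfy a 16-byte alignment requirement such as the one movdqa imposes. The helper names is_aligned and offset_pointer_is_16_aligned are hypothetical and not part of the report, and forcing the array to 16-byte alignment with _Alignas is an assumption made only so the result is deterministic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the report): returns nonzero when the
 * address p is a multiple of `alignment` bytes. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Mirrors the pointer used in the testcase: (char *)a + 2.
 * Assumption: the array is forced to 16-byte alignment (C11 _Alignas),
 * so the offset pointer is exactly 2 bytes past a 16-byte boundary. */
int offset_pointer_is_16_aligned(void)
{
    _Alignas(16) int a[9] = {0};
    return is_aligned((char *)a + 2, 16);
}
```

With the array placed on a 16-byte boundary, the offset pointer cannot be 16-byte aligned, so an instruction that requires that alignment would fault on it.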
When compiled with any of the following command lines:
gcc -O3 foo.c
g++ -O3 foo.c
gcc -m64 -O2 -ftree-vectorize gcc_bug.c
g++ -m64 -O2 -ftree-vectorize gcc_bug.c
gcc generates an aligned vector load
movdqa -54(%rsp,%rax), %xmm0
instead of an unaligned vector load (movdqu).
This causes the above application to crash with
SIGSEGV (General Protection Fault).
gcc-4.7 correctly generates
movdqu -54(%rsp), %xmm0
From: jakub at gcc dot gnu.org @ 2012-03-13 7:13 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52574
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org
         Resolution|                            |INVALID
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-13 07:12:55 UTC ---
The testcase is invalid C, while x86_64/i?86 will do the expected thing of
doing unaligned loads/stores silently, it won't do that in vectorized code or
for atomic accesses. You need to tell the compiler that ia isn't aligned
through the aligned attribute, e.g. typedef int T __attribute__((aligned (2)));
and then use T *__restrict ia instead of int *__restrict ia.
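A minimal sketch of the suggestion in comment #1, assuming GCC (or a compiler that accepts the same aligned attribute on a typedef). Only the typedef and the changed parameter type come from the comment; the driver foo_demo_sum, its buffer handling and its input values 0..7 are hypothetical additions for illustration. The buffer is filled and read back with memcpy so that only the accesses inside foo() depend on the under-aligned type.

```c
#include <stdlib.h>
#include <string.h>

/* int whose required alignment is lowered to 2 bytes (GCC extension),
 * as suggested in comment #1. */
typedef int T __attribute__((aligned (2)));

void foo(T *__restrict ia, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        ia[i] = ia[i] * ia[i];
    }
}

/* Hypothetical driver: stores 0..7 at a 2-byte offset into a heap buffer,
 * squares them in place through foo(), and returns the sum of the results.
 * Returns -1 if the allocation fails. */
int foo_demo_sum(void)
{
    enum { N = 8 };
    unsigned char *buf = malloc(2 + N * sizeof(int));
    int i, v, sum = 0;

    if (buf == NULL)
        return -1;
    for (i = 0; i < N; i++) {
        v = i;
        memcpy(buf + 2 + i * sizeof v, &v, sizeof v);
    }
    foo((T *)(buf + 2), N);   /* pointer is only 2-byte aligned */
    for (i = 0; i < N; i++) {
        memcpy(&v, buf + 2 + i * sizeof v, sizeof v);
        sum += v;
    }
    free(buf);
    return sum;
}
```

Because the pointee type now advertises 2-byte alignment, the compiler has no basis for assuming a 16-byte aligned address when it vectorizes the loop.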
From: deepak.ravi at gmail dot com @ 2012-03-13 20:57 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52574
Deepak Ravi <deepak.ravi at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |deepak.ravi at gmail dot com
--- Comment #2 from Deepak Ravi <deepak.ravi at gmail dot com> 2012-03-13 19:18:45 UTC ---
(In reply to comment #1)
> The testcase is invalid C, while x86_64/i?86 will do the expected thing of
> doing unaligned loads/stores silently, it won't do that in vectorized code or
> for atomic accesses.
Shouldn't the compiler vectorize the code _conservatively_, by generating code
to check whether the address is aligned or by generating unaligned vector load
instructions? Otherwise, code written for x86_64 that performs such unaligned
accesses will break at -O3 with newer gcc.
Also note that this bug is triggered only when __restrict is used. If
you remove __restrict, gcc generates proper code. It also works properly
with gcc 4.7 (even with __restrict).
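For source code that cannot guarantee alignment, one way to avoid depending on what the vectorizer assumes is to express each access as a byte copy. This is a sketch of that alternative, not something proposed in the thread: square_ints_unaligned and square_demo_sum are hypothetical names, and the input values 0..7 are chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alternative to foo(): squares n ints stored at an arbitrary
 * byte address. Each element is copied into a properly aligned local with
 * memcpy, so no int-typed access is ever made through the unaligned pointer. */
void square_ints_unaligned(unsigned char *bytes, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        int v;
        memcpy(&v, bytes + i * sizeof v, sizeof v);
        v = v * v;
        memcpy(bytes + i * sizeof v, &v, sizeof v);
    }
}

/* Hypothetical driver: places 0..7 at a 2-byte offset, squares them in
 * place, and returns the sum of the squared values. */
int square_demo_sum(void)
{
    enum { N = 8 };
    unsigned char buf[2 + N * sizeof(int)];
    int i, v, sum = 0;

    for (i = 0; i < N; i++) {
        v = i;
        memcpy(buf + 2 + i * sizeof v, &v, sizeof v);
    }
    square_ints_unaligned(buf + 2, N);
    for (i = 0; i < N; i++) {
        memcpy(&v, buf + 2 + i * sizeof v, sizeof v);
        sum += v;
    }
    return sum;
}
```

Whether a given compiler turns these copies into single unaligned loads and stores is an optimization detail; the point of the sketch is only that the code stays valid for any address.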