* [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps)
@ 2004-06-21 11:42 djp at volny dot cz
2004-06-21 15:09 ` [Bug target/16111] " bangerth at dealii dot org
` (11 more replies)
0 siblings, 12 replies; 13+ messages in thread
From: djp at volny dot cz @ 2004-06-21 11:42 UTC (permalink / raw)
To: gcc-bugs
My project stopped working after switching from 3.3.3 to 3.4.0. I've found that
the problem is related to SSE2: in some (register-intensive) code the compiler generates
movdqa (packed ints) instead of movaps (float4 vector), causing an invalid result (NaN).
I've created a simple test case to demonstrate the problem:
--------------------------------------------------------------------
#include <math.h>
#include <xmmintrin.h>
#include <stdio.h>
static inline __m128 xmm_dot4(__m128 a, __m128 b) {
    __m128 v0 = _mm_mul_ps(a, b);
    __m128 v1 = _mm_movehl_ps(v1, v0);
    v0 = _mm_add_ps(v0, v1);
    v1 = _mm_shuffle_ps(v0, v0, _MM_SHUFFLE(0,0,0,1));
    return _mm_add_ss(v0, v1);
}
/*
 * gcc-3.4.0+ generates invalid movdqa instruction;
 * works well if you replace movdqa by movaps in asm.
 */
void foo(float* boxCenter, float* boxExtents)
{
    unsigned int MASK = 0x80000000;
    __m128 mask = _mm_set1_ps((float&)MASK);
    __m128 center = _mm_loadu_ps(boxCenter);
    __m128 extents = _mm_loadu_ps(boxExtents);
    center = _mm_andnot_ps(mask, center); // common code for doing abs
    extents = _mm_xor_ps(mask, extents); // common code for doing neg
    center = xmm_dot4(center, extents);
    _mm_storeu_ps(boxCenter, center);
    _mm_storeu_ps(boxExtents, extents);
}
float center[] = { 1, 1, 1, 1 };
float extents[] = { 27.5f, 27.5f, 0, 0 };
int main()
{
    foo(center, extents);
    printf("extents (%f %f %f %f)\n", extents[0], extents[1], extents[2],
           extents[3]); // prints NaN
    return 0;
}
--------------------------------------------------------------------
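For readers following the data flow in xmm_dot4, here is a rough scalar model of what the intrinsic sequence computes in lane 0. It is only a sketch: the names Vec4, movehl, mul, add and dot4_lane0 are made up for illustration, and the `junk` parameter stands in for the value of v1 that the test case reads before assigning it. Only the upper two lanes of the movehl result come from that operand, and those never reach lane 0.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;

// Scalar model of _mm_movehl_ps(a, b): the low two lanes are taken from
// the high half of b, the high two lanes are kept from a.
Vec4 movehl(const Vec4& a, const Vec4& b) {
    return {b[2], b[3], a[2], a[3]};
}

// Lane-wise multiply, modelling _mm_mul_ps.
Vec4 mul(const Vec4& a, const Vec4& b) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i) r[i] = a[i] * b[i];
    return r;
}

// Lane-wise add, modelling _mm_add_ps.
Vec4 add(const Vec4& a, const Vec4& b) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i) r[i] = a[i] + b[i];
    return r;
}

// Lane 0 of xmm_dot4(a, b). `junk` is a placeholder for the first
// operand of the movehl step.
float dot4_lane0(const Vec4& a, const Vec4& b, const Vec4& junk) {
    Vec4 v0 = mul(a, b);          // p0, p1, p2, p3
    Vec4 v1 = movehl(junk, v0);   // p2, p3, junk2, junk3
    v0 = add(v0, v1);             // lane 0 = p0+p2, lane 1 = p1+p3
    // _MM_SHUFFLE(0,0,0,1) moves lane 1 of v0 into lane 0, and
    // _mm_add_ss adds only lane 0, giving (p0+p2) + (p1+p3).
    return v0[0] + v0[1];
}
```

Under this model the result in lane 0 is the full four-element dot product regardless of what `junk` holds; the upper lanes of the real result are a different matter and are not modelled here.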
I've tried both the 3.4.0 release and the latest CVS snapshot:
/opt/gcc-3.4.0/bin/g++-3.4 -v -save-temps -O3 -msse -mfpmath=sse
-fomit-frame-pointer -finline-limit=2000 "test.cxx" -o "test" -L/opt/gcc-3.4.0/lib
Reading specs from /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/specs
Configured with: ../configure --prefix=/opt/gcc-3.4.0 --program-suffix=-3.4
--enable-languages=c,c++,java --with-gcc --with-gnu-as --with-gnu-ld
--enable-shared --enable-threads=posix --enable-libgcj --disable-java-awt
--without-x --enable-java-gc=boehm --disable-debug --disable-libgcj-debug
--disable-interpreter --disable-x --enable-hash-synchronization
Thread model: posix
gcc version 3.4.1 20040618 (prerelease)
/opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1plus -E -quiet -v
-D_GNU_SOURCE test.cxx -msse -mfpmath=sse -mtune=pentiumpro -fomit-frame-pointer
-finline-limit=2000 -O3 -o test.ii
ignoring nonexistent directory
"/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1/i686-pc-linux-gnu
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1/backward
/usr/local/include
/opt/gcc-3.4.0/include
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/include
/usr/include
End of search list.
/opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1plus -fpreprocessed
test.ii -quiet -dumpbase test.cxx -msse -mfpmath=sse -mtune=pentiumpro -auxbase
test -O3 -version -fomit-frame-pointer -finline-limit=2000 -o test.s
GNU C++ version 3.4.1 20040618 (prerelease) (i686-pc-linux-gnu)
compiled by GNU C version 3.3.3 (Debian 20040321).
GGC heuristics: --param ggc-min-expand=90 --param ggc-min-heapsize=113152
as -V -Qy -o test.o test.s
GNU assembler version 2.14.90.0.7 (i386-linux) using BFD version 2.14.90.0.7
20031029 Debian GNU/Linux
/opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/collect2 --eh-frame-hdr -m
elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test /usr/lib/crt1.o
/usr/lib/crti.o /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/crtbegin.o
-L/opt/gcc-3.4.0/lib -L/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1
-L/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../.. test.o -lstdc++ -lm
-lgcc_s -lgcc -lc -lgcc_s -lgcc
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/crtend.o /usr/lib/crtn.o
/opt/gcc-3.4.0/bin/g++-3.4 -v -save-temps -O3 -msse -mfpmath=sse
-fomit-frame-pointer -finline-limit=2000 "test.cxx" -S -o "test.S"
-L/opt/gcc-3.4.0/lib
Reading specs from /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/specs
Configured with: ../configure --prefix=/opt/gcc-3.4.0 --program-suffix=-3.4
--enable-languages=c,c++,java --with-gcc --with-gnu-as --with-gnu-ld
--enable-shared --enable-threads=posix --enable-libgcj --disable-java-awt
--without-x --enable-java-gc=boehm --disable-debug --disable-libgcj-debug
--disable-interpreter --disable-x --enable-hash-synchronization
Thread model: posix
gcc version 3.4.1 20040618 (prerelease)
/opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1plus -E -quiet -v
-D_GNU_SOURCE test.cxx -msse -mfpmath=sse -mtune=pentiumpro -fomit-frame-pointer
-finline-limit=2000 -O3 -o test.ii
ignoring nonexistent directory
"/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1/i686-pc-linux-gnu
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1/backward
/usr/local/include
/opt/gcc-3.4.0/include
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/include
/usr/include
End of search list.
/opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1plus -fpreprocessed
test.ii -quiet -dumpbase test.cxx -msse -mfpmath=sse -mtune=pentiumpro
-auxbase-strip test.S -O3 -version -fomit-frame-pointer -finline-limit=2000 -o
test.S
GNU C++ version 3.4.1 20040618 (prerelease) (i686-pc-linux-gnu)
compiled by GNU C version 3.3.3 (Debian 20040321).
GGC heuristics: --param ggc-min-expand=90 --param ggc-min-heapsize=113152
--------------------------------------------------------------------
And the result:
.file "test.cxx"
.globl extents
.data
.align 4
.type extents, @object
.size extents, 16
extents:
.long 1104936960
.long 1104936960
.long 0
.long 0
.globl center
.align 4
.type center, @object
.size center, 16
center:
.long 1065353216
.long 1065353216
.long 1065353216
.long 1065353216
.text
.align 2
.p2align 4,,15
.globl _Z3fooPfS_
.type _Z3fooPfS_, @function
_Z3fooPfS_:
.LFB312:
subl $4, %esp
.LCFI0:
movl 8(%esp), %eax
movl $0x80000000, (%esp)
movl 12(%esp), %edx
movss (%esp), %xmm1
movups (%eax), %xmm6
movups (%edx), %xmm5
shufps $0, %xmm1, %xmm1
movdqa %xmm1, %xmm4
andnps %xmm6, %xmm4
xorps %xmm5, %xmm1
movaps %xmm4, %xmm0
mulps %xmm1, %xmm0
movhlps %xmm0, %xmm3
addps %xmm3, %xmm0
movaps %xmm0, %xmm2
shufps $1, %xmm0, %xmm2
addss %xmm2, %xmm0
movups %xmm0, (%eax)
movups %xmm1, (%edx)
popl %eax
ret
.LFE312:
.size _Z3fooPfS_, .-_Z3fooPfS_
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "extents (%f %f %f %f)\n"
.text
.align 2
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB313:
pushl %ebp
.LCFI1:
movl %esp, %ebp
.LCFI2:
subl $40, %esp
.LCFI3:
movl $0x80000000, -4(%ebp)
movups extents, %xmm5
andl $-16, %esp
subl $16, %esp
movss -4(%ebp), %xmm1
movups center, %xmm6
movl $.LC0, (%esp)
shufps $0, %xmm1, %xmm1
movdqa %xmm1, %xmm4
xorps %xmm5, %xmm1
andnps %xmm6, %xmm4
movaps %xmm4, %xmm0
movups %xmm1, extents
mulps %xmm1, %xmm0
movhlps %xmm0, %xmm3
flds extents+12
addps %xmm3, %xmm0
movaps %xmm0, %xmm2
shufps $1, %xmm0, %xmm2
addss %xmm2, %xmm0
fstpl 28(%esp)
flds extents+8
movups %xmm0, center
fstpl 20(%esp)
flds extents+4
fstpl 12(%esp)
flds extents
fstpl 4(%esp)
call printf
leave
xorl %eax, %eax
ret
.LFE313:
.size main, .-main
.section .eh_frame,"a",@progbits
.Lframe1:
.long .LECIE1-.LSCIE1
.LSCIE1:
.long 0x0
.byte 0x1
.string "zP"
.uleb128 0x1
.sleb128 -4
.byte 0x8
.uleb128 0x5
.byte 0x0
.long __gxx_personality_v0
.byte 0xc
.uleb128 0x4
.uleb128 0x4
.byte 0x88
.uleb128 0x1
.align 4
.LECIE1:
.LSFDE3:
.long .LEFDE3-.LASFDE3
.LASFDE3:
.long .LASFDE3-.Lframe1
.long .LFB313
.long .LFE313-.LFB313
.uleb128 0x0
.byte 0x4
.long .LCFI1-.LFB313
.byte 0xe
.uleb128 0x8
.byte 0x85
.uleb128 0x2
.byte 0x4
.long .LCFI2-.LCFI1
.byte 0xd
.uleb128 0x5
.align 4
.LEFDE3:
.section .note.GNU-stack,"",@progbits
.ident "GCC: (GNU) 3.4.1 20040618 (prerelease)"
--------------------------------------------------------------------
Note that replacing movdqa by movaps (or using gcc 3.3.3;-) fixes the problem.
Hope it helps.
--
Summary: generates invalid SSE movdqa instruction (instead of
movaps)
Product: gcc
Version: 3.4.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: translation
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: djp at volny dot cz
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
@ 2004-06-21 15:09 ` bangerth at dealii dot org
2004-06-21 16:34 ` pinskia at gcc dot gnu dot org
` (10 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: bangerth at dealii dot org @ 2004-06-21 15:09 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Component|translation |target
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
2004-06-21 15:09 ` [Bug target/16111] " bangerth at dealii dot org
@ 2004-06-21 16:34 ` pinskia at gcc dot gnu dot org
2004-06-21 17:18 ` djp at volny dot cz
` (9 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-21 16:34 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-06-21 16:34 -------
Hmm, this works for me on the mainline, 3.4.0, and 3.3.3:
tin:~/src/gnu/gcctest>g++ -O3 -msse -mfpmath=sse -fomit-frame-pointer -finline-limit=2000
pr16111.c
tin:~/src/gnu/gcctest>./a.out
extents (-27.500000 -27.500000 -0.000000 -0.000000)
tin:~/src/gnu/gcctest>~/ia32_linux_gcc3_4/bin/g++ -O3 -msse -mfpmath=sse -fomit-frame-pointer
-finline-limit=2000 pr16111.c
tin:~/src/gnu/gcctest>./a.out
extents (-27.500000 -27.500000 -0.000000 -0.000000)
tin:~/src/gnu/gcctest>~/ia32_linux_gcc3_3/bin/g++ -O3 -msse -mfpmath=sse -fomit-frame-pointer
-finline-limit=2000 pr16111.c
tin:~/src/gnu/gcctest>./a.out
extents (-27.500000 -27.500000 -0.000000 -0.000000)
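As a cross-check on the expected output, the two masking steps in foo() can be modelled lane by lane on the bit patterns. This is a sketch only: the helper names (to_bits, from_bits, abs_lanes, neg_lanes) are invented for illustration, and it assumes a 32-bit IEEE 754 float.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

using Vec4 = std::array<float, 4>;

// Copy a float's bit pattern into an integer, and back.
std::uint32_t to_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

float from_bits(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

const std::uint32_t kSignBit = 0x80000000u;

// Model of _mm_andnot_ps(mask, v): (~mask) & v clears the sign bit (abs).
Vec4 abs_lanes(const Vec4& v) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = from_bits(~kSignBit & to_bits(v[i]));
    return r;
}

// Model of _mm_xor_ps(mask, v): mask ^ v flips the sign bit (neg).
Vec4 neg_lanes(const Vec4& v) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = from_bits(kSignBit ^ to_bits(v[i]));
    return r;
}
```

Applied to the test data { 27.5f, 27.5f, 0, 0 }, neg_lanes yields -27.5 in the first two lanes and negative zero in the last two, which is consistent with the "-0.000000" entries in the output above.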
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |wrong-code
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
2004-06-21 15:09 ` [Bug target/16111] " bangerth at dealii dot org
2004-06-21 16:34 ` pinskia at gcc dot gnu dot org
@ 2004-06-21 17:18 ` djp at volny dot cz
2004-06-21 17:40 ` pinskia at gcc dot gnu dot org
` (8 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: djp at volny dot cz @ 2004-06-21 17:18 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From djp at volny dot cz 2004-06-21 17:18 -------
Did you run the test on AMD or Intel?
My results and more info:
GCC 3.4.0 (mainline)
/opt/gcc-3.4.0/bin/g++-3.4 -v -save-temps -O3 -msse -mfpmath=sse
-fomit-frame-pointer -finline-limit=2000 "test.cxx" -o "test" -L/opt/gcc-3.4.0/lib
==>
LD_LIBRARY_PATH="/opt/gcc-3.3.3/lib:$LD_LIBRARY_PATH" ./test
extents (-27.500000 -27.500000 -0.000000 nan)
GCC 3.3.3 (mainline)
/opt/gcc-3.3.3/bin/g++-3.3 -v -save-temps -O3 -msse -mfpmath=sse
-fomit-frame-pointer -finline-limit=2000 "test.cxx" -o "test" -L/opt/gcc-3.3.3/lib
==>
LD_LIBRARY_PATH="/opt/gcc-3.4.0/lib:$LD_LIBRARY_PATH" ./test
extents (-27.500000 -27.500000 -0.000000 -0.000000)
root@vox:/proc# cat cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 6
model name : AMD Athlon(tm) XP 2100+
stepping : 2
cpu MHz : 1737.340
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips : 3432.44
test.s from 3.4.0
-----------------
.type _Z3fooPfS_, @function
_Z3fooPfS_:
.LFB312:
subl $4, %esp
.LCFI0:
movl 8(%esp), %eax
movl $0x80000000, (%esp)
movl 12(%esp), %edx
movss (%esp), %xmm1
movups (%eax), %xmm6
movups (%edx), %xmm5
shufps $0, %xmm1, %xmm1
movdqa %xmm1, %xmm4
andnps %xmm6, %xmm4
xorps %xmm5, %xmm1
movaps %xmm4, %xmm0
mulps %xmm1, %xmm0
movhlps %xmm0, %xmm3
addps %xmm3, %xmm0
movaps %xmm0, %xmm2
shufps $1, %xmm0, %xmm2
addss %xmm2, %xmm0
movups %xmm0, (%eax)
movups %xmm1, (%edx)
popl %eax
ret
test.s from 3.3.3
-----------------
.type _Z3fooPfS_, @function
_Z3fooPfS_:
.LFB314:
subl $4, %esp
.LCFI0:
movl 8(%esp), %edx
movl $0x80000000, (%esp)
movl 12(%esp), %ecx
movss (%esp), %xmm5
movups (%edx), %xmm4
movups (%ecx), %xmm6
shufps $0, %xmm5, %xmm5
movaps %xmm5, %xmm2
andnps %xmm4, %xmm2
xorps %xmm6, %xmm5
movaps %xmm2, %xmm1
mulps %xmm5, %xmm1
movhlps %xmm1, %xmm3
addps %xmm3, %xmm1
movaps %xmm1, %xmm0
shufps $1, %xmm1, %xmm0
addss %xmm0, %xmm1
movups %xmm1, (%edx)
movups %xmm5, (%ecx)
popl %eax
ret
As you can see, the ONLY difference is
movdqa %xmm1, %xmm4
--
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (2 preceding siblings ...)
2004-06-21 17:18 ` djp at volny dot cz
@ 2004-06-21 17:40 ` pinskia at gcc dot gnu dot org
2004-07-25 17:35 ` drober32 at fau dot edu
` (7 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-21 17:40 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-06-21 17:40 -------
Mine was a pure Intel machine:
tin:~/src/gnu/gcctest>cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.00GHz
stepping : 4
cpu MHz : 1994.146
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts
acpi mmx fxsr sse sse2 ss ht tm
bogomips : 3971.48
So this is likely a bug in AMD's SSE implementation.
--
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (3 preceding siblings ...)
2004-06-21 17:40 ` pinskia at gcc dot gnu dot org
@ 2004-07-25 17:35 ` drober32 at fau dot edu
2004-09-21 12:53 ` coyote at coyotegulch dot com
` (6 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: drober32 at fau dot edu @ 2004-07-25 17:35 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From drober32 at fau dot edu 2004-07-25 17:35 -------
movdqa is an SSE2 instruction, which is not supported by Pentium 3 or Athlon XP.
The instruction should not be generated unless at least one of -msse2 or
-march=(pentium4, pentium-m, athlon64, etc.) is given.
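The two /proc/cpuinfo dumps earlier in the thread make this easy to confirm: the Athlon XP lists "sse" but not "sse2", while the Pentium 4 lists both. A small sketch of that check follows; has_cpu_flag is a hypothetical helper, not part of any library, and it compares whole whitespace-separated tokens so that "sse" does not match "sse2".

```cpp
#include <sstream>
#include <string>

// Returns true if `flag` appears as a whole token in a cpuinfo-style
// flags string (the text after "flags :").
bool has_cpu_flag(const std::string& flags, const std::string& flag) {
    std::istringstream in(flags);
    std::string token;
    while (in >> token) {
        if (token == flag) return true;
    }
    return false;
}
```

Called with the Athlon XP flags string quoted above, has_cpu_flag(flags, "sse2") is false, so a movdqa in the generated code cannot be executed correctly on that CPU.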
--
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (4 preceding siblings ...)
2004-07-25 17:35 ` drober32 at fau dot edu
@ 2004-09-21 12:53 ` coyote at coyotegulch dot com
2004-12-21 3:41 ` pinskia at gcc dot gnu dot org
` (5 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: coyote at coyotegulch dot com @ 2004-09-21 12:53 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From coyote at coyotegulch dot com 2004-09-21 12:53 -------
Is anyone working on this, or should I feel free to tackle it?
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |coyote at coyotegulch dot
| |com
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (5 preceding siblings ...)
2004-09-21 12:53 ` coyote at coyotegulch dot com
@ 2004-12-21 3:41 ` pinskia at gcc dot gnu dot org
2004-12-21 6:42 ` uros at kss-loka dot si
` (4 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-21 3:41 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-21 03:41 -------
Hmm, I wonder if this is fixed now.
--
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (6 preceding siblings ...)
2004-12-21 3:41 ` pinskia at gcc dot gnu dot org
@ 2004-12-21 6:42 ` uros at kss-loka dot si
2004-12-21 15:25 ` uros at kss-loka dot si
` (3 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: uros at kss-loka dot si @ 2004-12-21 6:42 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From uros at kss-loka dot si 2004-12-21 06:42 -------
Mainline does not generate movdqa insn anymore. However:
g++ -O1 -msse -mfpmath=sse -fomit-frame-pointer -finline-limit=2000 pr16111.cpp
./a.out
extents (-27.500000 -27.500000 -0.000000 -0.000000)
g++ -O2 -msse -mfpmath=sse -fomit-frame-pointer -finline-limit=2000 pr16111.cpp
./a.out
extents (0.000000 0.000000 2.018096 2.018096)
g++ -O3 -msse -mfpmath=sse -fomit-frame-pointer -finline-limit=2000 pr16111.cpp
./a.out
extents (0.000000 0.000000 36.658997 36.658997)
The result differs depending on the optimization level.
--
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (7 preceding siblings ...)
2004-12-21 6:42 ` uros at kss-loka dot si
@ 2004-12-21 15:25 ` uros at kss-loka dot si
2004-12-28 6:33 ` uros at kss-loka dot si
` (2 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: uros at kss-loka dot si @ 2004-12-21 15:25 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From uros at kss-loka dot si 2004-12-21 15:25 -------
Does life analysis eat RTLs with -O2?
This part is from pr16111_.cpp.15.cse2:
...
(note 8 7 13 0 NOTE_INSN_FUNCTION_BEG)
(insn 13 8 16 0 (set (mem/i:SI (plus:SI (reg/f:SI 20 frame)
(const_int -4 [0xfffffffc])) [5 MASK+0 S4 A32])
(const_int -2147483648 [0x80000000])) 35 {*movsi_1} (nil)
(nil))
(insn 16 13 17 0 (set (reg:SF 73)
(mem:SF (plus:SI (reg/f:SI 20 frame)
(const_int -4 [0xfffffffc])) [7 S4 A32])) 60 {*movsf_1} (nil)
(nil))
(insn 17 16 21 0 (set (mem/i:SF (plus:SI (reg/f:SI 20 frame)
(const_int -8 [0xfffffff8])) [7 __F+0 S4 A32])
(reg:SF 73)) 60 {*movsf_1} (nil)
(nil))
...
And in pr16111_.cpp.16.life, (insn 13) is just missing. There is no
NOTE_INSN_DELETED, just plain nothing:
...
(note 8 7 16 0 NOTE_INSN_FUNCTION_BEG)
(insn 16 8 17 0 (set (reg:SF 73)
(mem:SF (plus:SI (reg/f:SI 20 frame)
(const_int -4 [0xfffffffc])) [7 S4 A32])) 60 {*movsf_1} (nil)
(nil))
(insn 17 16 21 0 (set (mem/i:SF (plus:SI (reg/f:SI 20 frame)
(const_int -8 [0xfffffff8])) [7 __F+0 S4 A32])
(reg:SF 73)) 60 {*movsf_1} (insn_list:REG_DEP_TRUE 16 (nil))
(expr_list:REG_DEAD (reg:SF 73)
(nil)))
...
Uros.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Last reconfirmed|0000-00-00 00:00:00 |2004-12-21 15:25:19
date| |
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (8 preceding siblings ...)
2004-12-21 15:25 ` uros at kss-loka dot si
@ 2004-12-28 6:33 ` uros at kss-loka dot si
2004-12-29 0:32 ` rth at gcc dot gnu dot org
2004-12-29 0:35 ` pinskia at gcc dot gnu dot org
11 siblings, 0 replies; 13+ messages in thread
From: uros at kss-loka dot si @ 2004-12-28 6:33 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From uros at kss-loka dot si 2004-12-28 06:33 -------
The original bug is fixed for 4.0.0. The problem described in comment #8 looks
like a problem with aliasing (http://gcc.gnu.org/ml/gcc/2004-12/msg01096.html).
--
What |Removed |Added
----------------------------------------------------------------------------
Known to work| |4.0.0
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (9 preceding siblings ...)
2004-12-28 6:33 ` uros at kss-loka dot si
@ 2004-12-29 0:32 ` rth at gcc dot gnu dot org
2004-12-29 0:35 ` pinskia at gcc dot gnu dot org
11 siblings, 0 replies; 13+ messages in thread
From: rth at gcc dot gnu dot org @ 2004-12-29 0:32 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rth at gcc dot gnu dot org 2004-12-29 00:32 -------
The problem mentioned in comment 8 is not a bug.
(float&)MASK
has the exact same semantics as
*(float *)&MASK
which, as we all ought to know by now, is undefined. Open another PR for
the missing diagnostic with -Wstrict-aliasing if you like, but this one's
fixed.
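One way to build the same mask without the aliasing cast is to copy the bits with memcpy, which is well defined. The following is a minimal sketch under the assumption of a 32-bit IEEE 754 float; float_from_bits and sign_mask are illustrative names, not taken from the test case.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Produces the float whose object representation equals `bits`,
// without accessing an integer object through a float lvalue.
float float_from_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// The sign mask used by the test case, as a float value.
float sign_mask() {
    return float_from_bits(0x80000000u);
}
```

In the test case, the value returned by sign_mask() could then be passed to _mm_set1_ps in place of (float&)MASK.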
--
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
Target Milestone|--- |4.0.0
* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
` (10 preceding siblings ...)
2004-12-29 0:32 ` rth at gcc dot gnu dot org
@ 2004-12-29 0:35 ` pinskia at gcc dot gnu dot org
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-29 0:35 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-29 00:35 -------
(In reply to comment #10)
> which, as we all ought to know by now, is undefined. Open another PR for
> the missing diagnostic with -Wstrict-aliasing if you like, but this one's
> fixed.
PR 14024 is the PR about the aliasing warnings not working in C++.
--